
[jsk_perception] add speech_recognition #2055

Closed
wants to merge 1 commit into from

Conversation

furushchev
Member

@k-okada
Member

k-okada commented Apr 17, 2017 via email

@furushchev
Member Author

@k-okada Sorry for the late reply.

  1. How stable is that pip software?

I can't say for sure, since I'm not a speech recognition specialist, but the package seems to be widely used in Python.
ref: http://pypi-ranking.info/search/speech/

  1. What is the plan for the old speech recognition nodes?

I think each of these approaches has its own advantages and disadvantages.

  • Android
    • ◎ Free use of Google's rich speech recognition API
    • ◎ No permission prompt needed for each utterance
    • ◎ Automatically segments each spoken phrase
    • △ No template-based recognition
    • △ Requires an Android device with a specific application kept open
    • △ Must reconnect to the rosmaster each time
  • Web UI
    • ◎ Free use of Google's rich speech recognition API
    • ◎ i18n
    • ◎ Automatically segments each spoken phrase
    • ◎ Cross-platform
    • △ No template-based recognition
    • △ Requires Chrome with a specific site kept open
    • △ Requires permission through the GUI for each utterance (not programmable)
  • Python node (proposed in this PR)
    • ◎ Multiple selectable APIs
    • ◎ Supports template-based speech recognition
    • ◎ Can use the robot's microphone for voice input
    • ◎ Needs no GUI or special device
    • ◎ Supports offline recognition (if you choose Sphinx as the recognition engine)
    • △ Less maintained than the other methods
    • △ Some APIs are not free
    • △ VAD (Voice Activity Detection) is relatively poor
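For context on the VAD point in the list above, here is a minimal sketch of energy-threshold voice activity detection, the simple kind of scheme that tends to be fragile in noisy rooms. It is an illustration only, not the code in this PR: the function names, the threshold of 300 and the `max_silence` parameter are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square energy of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def segment_speech(
    frames: Sequence[Sequence[int]],
    threshold: float = 300.0,  # assumed value; real audio needs tuning
    max_silence: int = 2,      # quiet frames tolerated inside one phrase
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, with end exclusive."""
    segments = []
    start = None   # index of the first loud frame of the open segment
    silence = 0    # consecutive quiet frames seen since the last loud one
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence > max_silence:
                # Close the segment just after its last loud frame.
                segments.append((start, i - silence + 1))
                start = None
                silence = 0
    if start is not None:
        # Input ended while a segment was open; drop trailing quiet frames.
        segments.append((start, len(frames) - silence))
    return segments


quiet = [0] * 160
loud = [1000] * 160
frames = [quiet, loud, loud, quiet, loud, quiet, quiet, quiet, loud]
print(segment_speech(frames))  # [(1, 5), (8, 9)]
```

A fixed threshold like this cuts phrases badly when background noise rises above it or quiet speech falls below it, which is one reason a simple VAD can compare poorly with the segmentation done by the Android and Web UI approaches.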
  1. Is there any chance to share same message between old and new nodes?

For dictation-based continuous recognition only, yes.
I now understand that there are several types of speech recognition, each with different needs (e.g. template-based recognition, or recognition triggered through a service call for conversation).
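As a rough illustration of how a dictation-style result could still feed a template-based consumer, here is a sketch under stated assumptions. The `Candidates` class is hypothetical (it is not an existing message type from this PR), and fuzzy matching with the standard library's `difflib` stands in for whatever template matching a real node would use.

```python
import difflib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidates:
    """Hypothetical shared result: ranked transcripts with confidences."""
    transcript: List[str]
    confidence: List[float]


def match_template(
    result: Candidates,
    templates: List[str],   # lowercase phrases the robot should react to
    cutoff: float = 0.8,    # assumed similarity threshold
) -> Optional[str]:
    """Return the first template close to any candidate, best candidate first."""
    for text in result.transcript:
        hits = difflib.get_close_matches(text.lower(), templates, n=1, cutoff=cutoff)
        if hits:
            return hits[0]
    return None


result = Candidates(["pleese stop", "police top"], [0.7, 0.2])
print(match_template(result, ["please stop", "go forward"]))  # please stop
```

This only covers the dictation case. A service-call style of recognition would need a request/response interface on top of whatever result type is shared.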

What is the future plan for this node? How long are you going to
support/maintain this?

I apologize for having created multiple speech recognition nodes. However, since there is no de facto standard for speech recognition yet, I think we have to build something for each use case as the need arises.

  1. Please consider looking into other voice based robot (pepper?) to share
    common interface

OK. I'll take a look at Pepper's interface.

  1. If you still need this node, add rosdep entry.

This could be problematic. To enable VAD, we currently need to install the pyaudio module from pip rather than from apt, because the apt version is too old.
I'll look for a workaround, though.

@furushchev
Member Author

Closed due to inactivity.

3 participants