[jsk_perception] add speech_recognition #2055

furushchev · 2017-04-17T11:14:03Z

needed by jsk-ros-pkg/jsk_demos#1218
CC: @wkentaro

k-okada · 2017-04-17T13:52:39Z

1. How stable is that pip software?
2. What is the plan for the old speech recognition node?
3. Is there any chance of sharing the same message between the old and new nodes?
4. What is the future plan for this node? How long are you going to support/maintain it?
5. Please consider looking into other voice-based robots (Pepper?) to share a common interface.
6. If you still need this node, add a rosdep entry.

furushchev · 2017-04-21T14:05:40Z

@k-okada Sorry for the late reply.

How stable is that pip software?

I don't know, since I'm not a speech recognition expert, but the library seems to be widely used in Python.
ref: http://pypi-ranking.info/search/speech/

What is the plan for the old speech recognition node?

I think each of these approaches has its own advantages and disadvantages.

Android
- ◎ Free use of Google's rich speech recognition API
- ◎ No permission prompt needed for each utterance
- ◎ Automatically segments each spoken phrase
- △ No template-based recognition
- △ Requires an Android device with a specific application kept open
- △ Needs to connect to the rosmaster each time
Web UI
- ◎ Free use of Google's rich speech recognition API
- ◎ i18n
- ◎ Automatically segments each spoken phrase
- ◎ Cross-platform
- △ No template-based recognition
- △ Requires Chrome with a specific site kept open
- △ Needs permission through the GUI for each utterance (not programmable)
Python node (proposed in this PR)
- ◎ Multiple selectable APIs
- ◎ Supports template-based speech recognition
- ◎ Can use the robot's microphone for voice input
- ◎ No GUI or special device needed
- ◎ Supports offline recognition (if you choose sphinx as the recognition engine)
- △ Less maintained compared to the other methods
- △ Some APIs are not free
- △ VAD (Voice Activity Detection) is relatively poor
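
To make the "multiple selectable APIs" point concrete, here is a minimal sketch of how an engine could be selected by name. It assumes the pip package in question is SpeechRecognition (imported as `speech_recognition`), whose `Recognizer` exposes methods such as `recognize_google` and `recognize_sphinx`. The helper names `recognize` and `listen_and_recognize` are hypothetical and are not taken from this PR.

```python
def recognize(recognizer, audio, engine="google", **kwargs):
    """Call recognizer.recognize_<engine>(audio, **kwargs).

    `recognizer` is assumed to behave like speech_recognition.Recognizer.
    """
    method = getattr(recognizer, "recognize_" + engine, None)
    if method is None:
        raise ValueError("unsupported engine: %s" % engine)
    return method(audio, **kwargs)


def listen_and_recognize(engine="sphinx"):
    """Record one phrase from the default microphone and transcribe it.

    Needs the SpeechRecognition and PyAudio packages; engine="sphinx"
    additionally needs pocketsphinx and works offline.
    """
    import speech_recognition as sr  # imported lazily: optional dependency

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)  # blocks until a phrase ends
    return recognize(recognizer, audio, engine=engine)
```

With this shape, switching between an online and an offline engine is a matter of one parameter, which a node could expose as a ROS parameter.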

Is there any chance of sharing the same message between the old and new nodes?

For dictation-based continuous recognition only, yes.
I now know that there are several types of speech recognition and several kinds of needs (e.g. template-based recognition, or recognition through a service call for conversation).
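
To illustrate how template-based recognition differs from free dictation, here is a rough sketch that snaps a dictated transcript to the closest entry in a fixed phrase list. This is post-hoc fuzzy matching with `difflib` from the standard library, not the grammar-constrained decoding a recognition engine would perform. The function name `match_template` and the cutoff value are assumptions made for illustration.

```python
import difflib


def match_template(transcript, templates, cutoff=0.6):
    """Return the template closest to `transcript`, or None if none is close.

    Matching is case-insensitive; `cutoff` is the minimum similarity
    ratio (0.0 to 1.0) accepted by difflib.get_close_matches.
    """
    by_lower = {t.lower(): t for t in templates}
    hits = difflib.get_close_matches(
        transcript.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[hits[0]] if hits else None
```

A dictation node would publish the raw transcript, while a template-based node would publish only one of the known phrases (or nothing), so the two can share a message type only for the transcript part.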

What is the future plan for this node? How long are you going to support/maintain it?

I apologize for creating multiple speech recognition nodes, but since I think there is no de facto standard for speech recognition yet, I think we have to build whatever each use case needs.

Please consider looking into other voice-based robots (Pepper?) to share a common interface.

OK. I'll take a look at Pepper's interface.

If you still need this node, add a rosdep entry.

This could be problematic. To enable VAD, we currently need to install the pyaudio module from pip rather than from apt (the apt version is too old).
I'll look for a workaround to avoid this problem, though.

furushchev · 2017-06-30T13:28:16Z

Closed due to inactivity.

[jsk_perception] add speech_recognition

fed2b4e

wkentaro added feature pkg/jsk_perception labels May 2, 2017

furushchev closed this Jun 30, 2017

furushchev deleted the speech-recognition branch September 6, 2017 16:05

furushchev mentioned this pull request Sep 6, 2017

add ros_speech_recognition package jsk-ros-pkg/jsk_3rdparty#121

Merged
