How to have it always listening without start/stop buttons? #73
Comments
Right now, there are two ways you can do that:
The third option would be to use a simpler network to detect an activation word (like "Alexa" or "Siri") and only start whisper speech recognition after that. However, there is no built-in solution for a word spotter.
Thanks. With the Streaming Input solution, streaming seems to stop about a minute after it is enabled. In OnStreamFinished I tried stopping the recording and then restarting both the stream and the recording, so that streaming would resume whenever it stopped. Instead, after the first stop at ~1 minute, it kept stopping and starting constantly. This approach also seemed to freeze the editor every so often. In addition, after the first few segments, OnFinishSegment() started taking around 20 seconds to run even though I only talked for a second; it took only 1-2 seconds when streaming first started. With the second solution, I tried the Voice Commands Demo PR, but it is very delayed. Sometimes inference took 2 seconds, other times 12 seconds, although your test video seems nearly instant. I'm sure your PC specs are better than mine, but 12 seconds to run inference on two words doesn't seem right.
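The activation-word idea above can be sketched as a small gate in front of the recognizer. This is only an illustration in Python, not part of this project: `is_wake_word` and `transcribe` are hypothetical callables standing in for a cheap keyword detector and for the expensive speech recognizer, and `command_chunks` is an assumed fixed length for the command that follows the activation word.

```python
from typing import Callable, Iterable, List

Chunk = List[float]  # one block of mono audio samples


def gated_transcribe(
    chunks: Iterable[Chunk],
    is_wake_word: Callable[[Chunk], bool],     # hypothetical cheap detector
    transcribe: Callable[[List[Chunk]], str],  # hypothetical expensive recognizer
    command_chunks: int = 3,                   # assumed command length, in chunks
) -> List[str]:
    """Run `transcribe` only on the chunks that follow a wake-word hit."""
    results: List[str] = []
    pending: List[Chunk] = []
    listening = False
    for chunk in chunks:
        if not listening:
            # Idle state: only the cheap detector sees the audio.
            listening = is_wake_word(chunk)
            continue
        # Active state: collect audio for the recognizer.
        pending.append(chunk)
        if len(pending) == command_chunks:
            results.append(transcribe(pending))
            pending = []
            listening = False
    # A command cut off by the end of the input is dropped, not transcribed.
    return results
```

The point of the design is that the expensive model is never called while idle, so its cost is paid once per command rather than continuously.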
Streaming example scene should have
What model weights do you use (tiny, base, large, etc.)? Could you share your hardware specs? Do you use CPU or GPU inference?
Ah, I never noticed that Loop option; that should fix the issue. I'm using the Tiny model. My CPU is a Ryzen 5 2600 and my GPU is a GTX 970 (obviously not the best specs, but they shouldn't give results this unreliable). I'm using whatever the default is; I don't see an option to choose between GPU and CPU.
You can try CUDA inference. It might be faster on your hardware, but you would need to install the CUDA toolkit. You can also try enabling the "Speed Up" setting in Finally, you can play around with streaming settings, like
I would like to add voice commands to my game. How can I have it always listening, without having to click start and stop buttons?
I did try checking the volume from the mic using audioClip.GetData, but that seems to break after I run microphoneRecord.StartRecord().
How could this be done? Thanks!
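For reference, the mic-volume idea from the question can be sketched independently of any engine. The Python below is a minimal illustration, not this project's API: it assumes you already have frames of float samples (for example, buffers like the ones audioClip.GetData fills), and `threshold` and `hangover_frames` are made-up tuning values. It does not address why the volume check breaks after StartRecord().

```python
import math
from typing import List, Sequence, Tuple


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one frame; 0.0 for an empty frame."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def speech_segments(
    frames: Sequence[Sequence[float]],
    threshold: float = 0.02,   # assumed loudness cutoff
    hangover_frames: int = 2,  # quiet frames tolerated before closing a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of loud runs, end exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first loud frame of the open segment
    quiet = 0     # consecutive quiet frames seen inside the open segment
    for i, frame in enumerate(frames):
        loud = rms(frame) >= threshold
        if start is None:
            if loud:
                start, quiet = i, 0
        elif loud:
            quiet = 0
        else:
            quiet += 1
            if quiet >= hangover_frames:
                # Close the segment just after its last loud frame.
                segments.append((start, i - quiet + 1))
                start = None
    if start is not None:
        # Input ended while a segment was still open.
        segments.append((start, len(frames) - quiet))
    return segments
```

Each returned segment marks audio that could be handed to the recognizer, which would replace the manual start and stop buttons with automatic boundaries.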