The root cause here appears to be the stdio redirection, which results in twice the expected amount of data being available.
I tried to manually call sox to see how it was producing audio.
Experiment results:
sox.exe -c 1 -b 16 -e signed-integer -r 16000 -t waveaudio default -p > redirect.wav
Ran for 10s.
redirect.wav is 655,408 bytes.
Then I had sox write the file directly:
sox.exe -c 1 -b 16 -e signed-integer -r 16000 -t waveaudio default redirect2.wav
Ran for 10s.
This produced a 327,724-byte redirect2.wav.
That tells me the doubling of the data is happening as a result of the stdio redirect. It's not clear why that's happening, but the possibility that the doubling is platform-specific raises fragility concerns. Plus, who knows what extra data is winding up in the audio.
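As a sanity check on those two numbers, here is a rough sketch of the arithmetic. It assumes raw PCM plus a canonical 44-byte WAV header (the redirected file may use a different header, so treat that constant as an assumption), and the function names are made up for illustration.

```python
# Assumption: a canonical PCM WAV header is 44 bytes. The redirected file
# may not use this header, so the results are only approximate.
ASSUMED_HEADER_BYTES = 44


def expected_pcm_bytes(seconds, sample_rate=16000, channels=1, bits_per_sample=16):
    """Bytes of raw PCM payload for a recording of the given length."""
    return seconds * sample_rate * channels * (bits_per_sample // 8)


def implied_seconds(file_bytes, sample_rate=16000, channels=1, bits_per_sample=16):
    """Duration implied by a file size, after subtracting the assumed header."""
    bytes_per_second = sample_rate * channels * (bits_per_sample // 8)
    return (file_bytes - ASSUMED_HEADER_BYTES) / bytes_per_second


if __name__ == "__main__":
    print("10 s at 16 bit:", expected_pcm_bytes(10), "bytes")
    print("10 s at 32 bit:", expected_pcm_bytes(10, bits_per_sample=32), "bytes")
    print("redirect2.wav as 16 bit:", implied_seconds(327_724), "s")
    print("redirect.wav as 16 bit:", implied_seconds(655_408), "s")
    print("redirect.wav as 32 bit:", implied_seconds(655_408, bits_per_sample=32), "s")
```

If the redirected file is read as 16-bit audio it implies roughly twice the recording time, while reading it as 32-bit audio gives roughly the same duration as the directly written file.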
#40 may be the root cause here: piping audio out of sox forces the format to be 32-bit audio, which may give the appearance of it generating double the data when set to 16-bit.
I am trying to recognize the user's voice continuously, but I am always getting wrong results. Has anybody done something like this?
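One way to test the 32-bit theory is to inspect what the captured bytes actually declare. The sketch below uses Python's standard `wave` module; `describe_wav` is a hypothetical helper, not part of sox or any SDK. If the data does not parse as WAV at all, that is itself a hint that the pipe is emitting some other container.

```python
import io
import wave


def describe_wav(data):
    """Return the declared format of WAV bytes, or None if not parseable as WAV."""
    try:
        with wave.open(io.BytesIO(data), "rb") as w:
            return {
                "channels": w.getnchannels(),
                "bits_per_sample": w.getsampwidth() * 8,
                "sample_rate": w.getframerate(),
            }
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE stream, or the header is truncated.
        return None


if __name__ == "__main__":
    # The file name comes from the experiment above; adjust as needed.
    with open("redirect.wav", "rb") as f:
        print(describe_wav(f.read()))
```

A reported `bits_per_sample` of 32, or a `None` result, would both be consistent with the piped output not being the 16-bit WAV that was requested.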
I will add some parts of my code so you can understand.
Here is how I create an instance of pushStream (MS Speech SDK)
Here is the method I call to recognize the user voice
And here is where I use the mic package to get the user voice data