The VAD tends to report false-positive speech on long silent segments. It looks like when an entire audio chunk contains only silence, the energy-based VAD still detects speech in it.
You can work around this by setting VadContextSec in MicrophoneRecord to a high value. But while VAD is fairly cheap, its context can't grow indefinitely, and a large context will cause lag spikes.
Another solution is to replace the VAD with something more robust (like silero-vad). But I don't want to include any extra dependencies, especially ones that don't use ggml.
I think it can be fixed by some hack, but I don't know what that might be.
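One possible hack, sketched here in Python for illustration only: if the energy-based VAD compares the energy of the most recent audio against the energy of the whole context (an assumption about how it works, not something confirmed above), then in pure silence small noise fluctuations can push that ratio over the threshold. Adding an absolute energy floor before the relative check would reject those cases. The names `relative_vad`, `gated_vad`, `ratio_thold`, and `floor`, and all default values, are hypothetical and would need tuning against real microphone input.

```python
def mean_abs(samples):
    """Average absolute amplitude of float samples in [-1, 1]."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0


def relative_vad(samples, sample_rate, last_sec=0.5, ratio_thold=1.2):
    """Relative-only check (assumed behaviour of the current VAD):
    report speech when the tail is louder than the context average."""
    n_last = int(sample_rate * last_sec)
    if len(samples) <= n_last:
        return False  # not enough context to compare against
    energy_all = mean_abs(samples)
    energy_last = mean_abs(samples[-n_last:])
    return energy_last > ratio_thold * energy_all


def gated_vad(samples, sample_rate, last_sec=0.5, ratio_thold=1.2, floor=0.005):
    """Same relative check, but gated by an absolute energy floor.
    `floor` is a hypothetical value, not taken from any library."""
    n_last = int(sample_rate * last_sec)
    if len(samples) <= n_last:
        return False
    # If the tail is below the floor, treat it as silence regardless of
    # how it compares with the rest of the context.
    if mean_abs(samples[-n_last:]) < floor:
        return False
    return relative_vad(samples, sample_rate, last_sec, ratio_thold)
```

With a context that is silent throughout but has a slightly noisier tail, `relative_vad` reports speech while `gated_vad` does not. The trade-off is that a fixed floor depends on microphone gain, so very quiet speech could be missed.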