VAD false-positive on long segments #50

Macoron · 2023-08-14T20:14:50Z

VAD tends to report false-positive speech on long silent segments. It looks like, if the whole audio chunk contains only silence, the energy-based VAD still detects speech in it.
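To illustrate the suspected cause, here is a minimal Python sketch of a relative-energy VAD. It is an assumption about how this kind of detector behaves, not the project's actual code: `relative_energy_vad`, `make_noise`, `make_tone`, and the 0.6 threshold are all hypothetical. The check compares the energy of the most recent window against the average over the whole context, so when the context is pure silence the two values are nearly equal and the ratio test passes.

```python
import math
import random

SAMPLE_RATE = 16000  # assumed sample rate for this sketch


def make_noise(seconds, amplitude=0.001, seed=0):
    """Low-level background noise standing in for 'silence'."""
    rng = random.Random(seed)
    return [rng.uniform(-amplitude, amplitude) for _ in range(int(SAMPLE_RATE * seconds))]


def make_tone(seconds, amplitude=0.5, freq=220.0):
    """A loud sine tone standing in for speech."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def mean_abs_energy(samples):
    """Mean absolute amplitude of the samples."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0


def relative_energy_vad(samples, last_sec=0.5, threshold=0.6):
    """Hypothetical detector: speech if the last window is louder than
    `threshold` times the average energy of the whole context."""
    n_last = int(SAMPLE_RATE * last_sec)
    if n_last >= len(samples):
        return False  # not enough context to compare against
    energy_all = mean_abs_energy(samples)
    energy_last = mean_abs_energy(samples[-n_last:])
    return energy_last > threshold * energy_all


# Pure silence: last window and whole context have about the same energy,
# so the relative test fires even though nobody is speaking.
silence_only = make_noise(2.0)

# Loud audio followed by silence: the last window is far quieter than
# the context average, so the detector correctly reports no speech.
speech_then_silence = make_tone(1.5) + make_noise(0.5)
```

With these inputs, the silence-only chunk is flagged as speech, while the chunk that contains earlier loud audio is not. That would also be consistent with a larger context helping, since a longer window is more likely to include real speech to compare against.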

You can work around this by setting VadContextSec in MicrophoneRecord to a higher value. But although VAD is fairly cheap, its context can't grow indefinitely: a large context will cause lag spikes.

Another solution is to replace the VAD with something more robust (like silero-vad). But I don't want to include any extra dependencies, especially ones that don't use ggml.

I think it can be fixed by some hack, but I don't know what that might be.
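One possible hack, sketched in Python under the same assumptions as a relative-energy detector: add an absolute energy floor before the relative comparison. Everything here is hypothetical (`vad_with_floor`, `min_energy`, and the chosen values are not part of the project), and the floor would need tuning per microphone and gain setting.

```python
import math
import random

SAMPLE_RATE = 16000  # assumed sample rate for this sketch


def mean_abs_energy(samples):
    """Mean absolute amplitude of the samples."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0


def vad_with_floor(samples, last_sec=0.5, threshold=0.6, min_energy=0.005):
    """Hypothetical relative-energy VAD with an absolute floor.

    `min_energy` is an illustrative value, not a recommended default.
    """
    n_last = int(SAMPLE_RATE * last_sec)
    if n_last >= len(samples):
        return False  # not enough context to compare against
    energy_last = mean_abs_energy(samples[-n_last:])
    if energy_last < min_energy:
        return False  # too quiet to be speech, whatever the ratio says
    energy_all = mean_abs_energy(samples)
    return energy_last > threshold * energy_all


def noise(seconds, amplitude=0.001, seed=0):
    """Low-level background noise standing in for 'silence'."""
    rng = random.Random(seed)
    return [rng.uniform(-amplitude, amplitude) for _ in range(int(SAMPLE_RATE * seconds))]


def tone(seconds, amplitude=0.3, freq=220.0):
    """A loud sine tone standing in for speech."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


silence_only = noise(2.0)                 # should no longer be flagged
silence_then_speech = noise(1.5) + tone(0.5)  # should still be detected
```

The floor check costs one comparison, so it does not add to the lag from a large context. The trade-off is that very quiet speech below `min_energy` would be missed.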

Macoron added the enhancement and bug labels and removed the enhancement label on Aug 14, 2023
