[Bug]: Can get stuck during musical sections. #170

MCCMikey · 2024-07-10T12:18:14Z

What happened?

I supplied it with a two-hour recording of a radio program.

For about 30 minutes of the recording it repeatedly output the line [Music] rather than transcribing the spoken words between tracks.

From about the 40-minute mark it resumed normal output, except for one point where it repeated the line

They're only living in a world where they can say goodbye.

about 40 times.

I have to say though that it does a remarkably good job with such varied content. If this can be fixed I plan to ask this program to periodically verify that our presenters are playing the sponsor messages that they are meant to on our community radio station. I'm definitely going to pay for this app via the support link. I've been looking for something like this for ages.

2020-09-24 Transcription.docx

Steps to reproduce

Feed it the audio file https://drive.google.com/file/d/1nqtLWZTUEJjvVWGDUJRQTRdyJNGbCESM/view?usp=sharing and ask it to transcribe.

What OS are you seeing the problem on?

Windows

Relevant log output

No response

thewh1teagle · 2024-07-11T00:17:40Z

> For about 30 minutes of the recording it repeatedly output the line [Music] rather than transcribing the spoken words between tracks.

I understand that you've encountered challenges transcribing audio with music and background noise. Unfortunately, the Whisper AI model isn't the best fit for this task, as discussed here.

I propose combining a VAD AI model (Voice Activity Detector) with a denoiser model (for noise filtering and speech enhancement). I'd love to hear what other developers think about this approach—please feel free to share your thoughts.

It's worth noting that this isn't a simple task, and I don't believe there's an existing solution for this worldwide, at least not in the non-commercial realm.

For the VAD, we can use the Silero VAD model with Sherpa-rs, and for speech enhancement, we can use DeepFilterNet.
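To make the idea concrete, here is a rough sketch of the "gate the audio before transcribing" step. It is only an illustration under assumptions: `speech_segments`, `energy_gate`, and `transcribe_speech_only` are hypothetical names, not part of Vibe, Sherpa-rs, Silero VAD, or DeepFilterNet. The `is_speech` callable is where a real VAD model would plug in, and `transcribe` stands for whatever speech-to-text call is used. The energy threshold shown is just a stand-in and cannot tell music from speech.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]
Segment = Tuple[int, int]  # (start_sample, end_sample), end exclusive


def speech_segments(
    samples: Sequence[float],
    frame_len: int,
    is_speech: Callable[[Frame], bool],
) -> List[Segment]:
    """Group consecutive frames flagged as speech into sample ranges."""
    segments: List[Segment] = []
    start = None
    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        if is_speech(frame):
            if start is None:
                start = pos  # a new speech run begins here
        elif start is not None:
            segments.append((start, pos))  # the run ended at this frame
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # run lasted to the end
    return segments


def energy_gate(threshold: float) -> Callable[[Frame], bool]:
    """Placeholder classifier (assumption): mean squared amplitude
    above a threshold. A real VAD model would replace this."""
    def classify(frame: Frame) -> bool:
        return sum(x * x for x in frame) / len(frame) > threshold
    return classify


def transcribe_speech_only(
    samples: Sequence[float],
    frame_len: int,
    is_speech: Callable[[Frame], bool],
    transcribe: Callable[[Frame], str],
) -> List[Tuple[int, str]]:
    """Run the transcriber only on ranges flagged as speech,
    keeping each range's start offset so timestamps can be rebuilt."""
    return [
        (start, transcribe(samples[start:end]))
        for start, end in speech_segments(samples, frame_len, is_speech)
    ]
```

With this shape, the denoiser would sit between the segmenting step and `transcribe`, applied to each speech range before it is passed on. Keeping the start offset per range matters because the transcriber no longer sees the full recording, so timestamps have to be shifted back afterwards.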

> I have to say though that it does a remarkably good job with such varied content. If this can be fixed I plan to ask this program to periodically verify that our presenters are playing the sponsor messages that they are meant to on our community radio station. I'm definitely going to pay for this app via the support link. I've been looking for something like this for ages.

I'm glad you liked it, and thank you very much for your support in improving Vibe. It's greatly appreciated!

thewh1teagle · 2024-12-04T13:48:46Z

Closing as duplicate of #402

MCCMikey added the bug (Something isn't working) label Jul 10, 2024

thewh1teagle closed this as completed Dec 4, 2024
