issues with longer audio files and from_pretrained() #1562

benniekiss · 2023-11-24T15:59:17Z

I've been noticing that with audio files over 5 hours, about halfway through processing, the CPU and GPU usage drops off, but the pipeline still runs for about an equal length of time. For example, for 3 hours of audio, processing takes about 2 minutes on the GPU, followed by 2 minutes of low CPU and no GPU usage before the result is returned. I was wondering whether a specific model in the pipeline doesn't use the GPU, or whether another step is causing this bottleneck.

Another issue is that for audio longer than 8 hours, the process crashes in my Jupyter notebook running on an RTX 4090 through RunPod. The same pattern occurs: about halfway through (about 4 minutes), the CPU and GPU usage drop off, and then the process runs for another 4 to 6 minutes before crashing. Is there a limit to how much pyannote can process at once? Splitting the audio would be trivial, but I was curious about pyannote's limits and wanted to avoid further dividing the audio if possible, especially to retain speaker_ids.

I've had great success with audio less than 8 hours.
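
If splitting does turn out to be necessary, the window boundaries could be computed along these lines. This is only a sketch: the function name, the chunk length, and the overlap are hypothetical choices, not anything pyannote prescribes. Each chunk would be diarized in a separate run, so the speaker labels would not line up across chunks by themselves; the overlap is there so labels could be matched afterwards, which is not shown here.

```python
def chunk_windows(total_seconds, chunk_seconds, overlap_seconds=0.0):
    """Return (start, end) windows in seconds that cover the whole recording.

    Consecutive windows share `overlap_seconds` of audio, so speaker labels
    from independent runs can later be compared on the shared region.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    if not 0 <= overlap_seconds < chunk_seconds:
        raise ValueError("overlap_seconds must be in [0, chunk_seconds)")

    windows = []
    start = 0.0
    step = chunk_seconds - overlap_seconds  # how far each window advances
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the last window reached the end of the audio
        start += step
    return windows
```

For a 10-hour file, `chunk_windows(10 * 3600, 4 * 3600, 600)` would give three windows of at most 4 hours, each sharing 10 minutes with its neighbour.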

hbredin · 2023-11-24T16:16:19Z

hbredin
Nov 24, 2023
Maintainer

Using a ProgressHook may help you figure out in which part of the pipeline this happens.
See this documented feature here.
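
For reference, a minimal sketch of that usage. It assumes `pipeline` is an already-loaded pyannote.audio pipeline and `audio_path` points at the file to process; the wrapper function name is hypothetical.

```python
def diarize_with_progress(pipeline, audio_path):
    """Run an already-loaded pyannote pipeline and display per-step progress."""
    # Imported lazily so this helper can be defined without pyannote.audio installed.
    from pyannote.audio.pipelines.utils.hook import ProgressHook

    # ProgressHook is a context manager; passing it as `hook=` lets the
    # pipeline report each step as it runs.
    with ProgressHook() as hook:
        return pipeline(audio_path, hook=hook)
```

The last step shown before CPU and GPU usage drop off should indicate where the time is going.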

3 replies

benniekiss Nov 24, 2023
Author

Thank you! I'll test it with the hook.

benniekiss Nov 24, 2023
Author

Running the pipeline with the hook does not show any progress information in the Jupyter notebook. All I get is Output() when running it per the documentation.

EDIT: Never mind, I had to restart VS Code and now it's displaying properly!

benniekiss Nov 24, 2023
Author

The issue happens after the embeddings step completes. It never shows the next step, discrete_diarization.
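
To put numbers on that gap, a custom hook could record when each step last reported. This is a sketch under an assumption: that the pipeline calls the hook as `hook(step_name, step_artifact, **kwargs)`, with keyword arguments such as `file`, `total`, and `completed`, as pyannote.audio 3.x appears to do. `StepTimer` is a hypothetical helper, not part of pyannote.

```python
import time


class StepTimer:
    """Callable hook that records when each pipeline step last reported."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self.last_seen = {}  # step name -> seconds since the timer was created

    def __call__(self, step_name, step_artifact, **kwargs):
        # A step may report many times (e.g. once per batch); keep the latest.
        self.last_seen[step_name] = self._clock() - self._start

    def report(self):
        """Return (step, seconds since the previous step last reported)."""
        rows, previous = [], 0.0
        for name, seen in sorted(self.last_seen.items(), key=lambda kv: kv[1]):
            rows.append((name, seen - previous))
            previous = seen
        return rows


# Assumed usage with an already-loaded pipeline:
#   timer = StepTimer()
#   diarization = pipeline("audio.wav", hook=timer)
#   for step, seconds in timer.report():
#       print(f"{step}: {seconds:.1f}s")
```

The time attributed to discrete_diarization covers everything that runs between the last embeddings report and that step, which is where the slowdown described above would show up. If the process crashes first, `timer.last_seen` still holds the timings of the steps that did report.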
