
Whisper transcription app big performance regression #712

Open
thundergolfer opened this issue Apr 20, 2024 · 9 comments

@thundergolfer (Collaborator)

https://modal-com.slack.com/archives/C069RAH7X4M/p1713624663717089

@thundergolfer (Collaborator, Author)

A one-hour podcast used to take ~1 minute to transcribe, so this is a big drop in performance.

@thundergolfer (Collaborator, Author)

I think the first thing to do is to replace the use of NFS.

@ahxxm commented May 18, 2024

It would be great if the official example used WhisperX: it can transcribe a one-hour podcast in ~1 minute using only one container (more specifically, one graphics card with 16 GB of VRAM, running large-v3), instead of spinning up 100–300 containers for a single transcription.
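The 100–300 container figure comes from the fan-out design, where the audio is split into short segments and each segment is transcribed in its own container. A minimal sketch of that arithmetic (the fixed segment length here is an illustrative assumption, not the official example's actual value):

```python
# Sketch of the fan-out math behind a chunked transcription pipeline.
# ASSUMPTION: the segment length is illustrative, not the example's real setting.
import math

def num_containers(audio_seconds: float, segment_seconds: float) -> int:
    """Number of per-segment transcription tasks (one container each)."""
    return math.ceil(audio_seconds / segment_seconds)

# A one-hour podcast split into ~20-second segments lands in the
# 100-300 container range described above:
print(num_containers(60 * 60, 20))  # -> 180
```

By contrast, a single-container WhisperX run keeps `num_containers` at 1 and relies on batched inference on one GPU instead of horizontal fan-out.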

@ahxxm commented May 18, 2024

I made a proof-of-concept repo here: https://github.com/ahxxm/serverless-audio-transcriber

@ahxxm commented Oct 5, 2024

I've been dogfooding my own version for a while; this is how it looks with an A10G.

The audio files I sent range from 30 to 70 minutes, and I scheduled them at a 5-minute interval.


This works out to around $0.01–$0.02 per hour of transcription. I wonder how the current official approach compares, before and after the regression — is it cheaper and faster?
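That per-hour figure can be sanity-checked with simple arithmetic. The GPU hourly rate below is an illustrative assumption (published A10G pricing has varied), but at the ~60x-realtime throughput mentioned earlier in the thread, the result lands inside the quoted $0.01–$0.02 range:

```python
# Sanity-check of the ~$0.01-$0.02 per transcribed-hour figure.
# ASSUMPTION: the A10G rate below is illustrative, not an exact price quote.
A10G_DOLLARS_PER_HOUR = 1.10  # hypothetical on-demand GPU rate

def cost_per_audio_hour(gpu_rate: float, realtime_factor: float) -> float:
    """Dollar cost to transcribe one hour of audio.

    realtime_factor: hours of audio transcribed per GPU-hour
    (one hour of audio in ~1 minute -> factor of ~60).
    """
    return gpu_rate / realtime_factor

print(round(cost_per_audio_hour(A10G_DOLLARS_PER_HOUR, 60), 4))  # -> 0.0183
```

Cold-start overhead (the 20+ seconds of weight loading discussed below) would push the effective cost up for short clips, since the GPU bills while idle during load.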

@thundergolfer (Collaborator, Author) commented Oct 5, 2024

Thanks @ahxxm, this is awesome, especially the benchmarks you've listed. It looks like Runpod is the cheapest? I'd bet that the RTX A4500 $/hr rate is cheaper than our A10G.

@ahxxm commented Oct 5, 2024

Yeah, the A4500 is new, performant (for Whisper), and cheap (Modal's T4 price); it would be great if Modal also supported it.

I think it's comparable, but not a completely fair comparison: Runpod has FlashBoot, which loads the weights within 1 second instead of 20+ seconds, so I added a bit more CPU and memory to the Modal code.

@gongouveia

@thundergolfer @ahxxm Hello, thank you for this very informative thread. I would like to ask how I could use my own model with Modal — how can I send my file to the container image?

@ahxxm commented Nov 9, 2024

@gongouveia I doubt this is relevant to the issue, but are you asking how to send a custom model/weights into the Docker image, or how to send the file you want to transcribe? (If the latter, see the examples in this repo, or in mine.)
