Should we remove ffmpeg? #3

Open
sloganking opened this issue Oct 22, 2023 · 1 comment
Labels
question Further information is requested

Comments

@sloganking
Owner

ffmpeg currently converts the recorded wav files to mp3 files. This is because wav files are uncompressed, and the OpenAI API has a hard limit: files must be under 25 MB.

https://platform.openai.com/docs/guides/speech-to-text/longer-inputs

By default, the Whisper API only supports files that are less than 25 MB. If you have an audio file that is longer than that, you will need to break it up into chunks of 25 MB's or less or used a compressed audio format.
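To make the limit concrete, here is a rough sketch of how long an uncompressed recording can be before it crosses 25 MB. It assumes plain PCM with the usual 44-byte WAV header and reads "25 MB" as 25,000,000 bytes; the real cutoff may be counted differently, and the function names are made up for illustration.

```rust
/// Size limit taken from the guide quoted above.
/// Assumption: "MB" means 1_000_000 bytes; the API may count 1024 * 1024 instead.
const LIMIT_BYTES: u64 = 25_000_000;

/// Size of the canonical PCM WAV header (RIFF, fmt and data chunk headers).
const WAV_HEADER_BYTES: u64 = 44;

/// Bytes of uncompressed PCM produced per second of audio.
fn pcm_bytes_per_second(sample_rate: u32, channels: u16, bits_per_sample: u16) -> u64 {
    sample_rate as u64 * channels as u64 * (bits_per_sample as u64 / 8)
}

/// Longest recording, in whole seconds, whose WAV file stays within `limit_bytes`.
/// Panics on a division by zero if any of the format parameters is zero.
fn max_wav_seconds(
    limit_bytes: u64,
    sample_rate: u32,
    channels: u16,
    bits_per_sample: u16,
) -> u64 {
    let payload = limit_bytes.saturating_sub(WAV_HEADER_BYTES);
    payload / pcm_bytes_per_second(sample_rate, channels, bits_per_sample)
}
```

With these assumptions, `max_wav_seconds(LIMIT_BYTES, 44_100, 2, 16)` gives 141 seconds of 16-bit stereo at 44.1 kHz, and `max_wav_seconds(LIMIT_BYTES, 16_000, 1, 16)` gives 781 seconds of 16-bit mono at 16 kHz. Either way a wav recording runs out of room within minutes, which is why some compression step is needed.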

If I knew how to save audio recordings directly to mp3 files in Rust, we could drop ffmpeg. But I've only seen an example of writing to wav files so far. See:

https://github.com/RustAudio/cpal/blob/master/examples/record_wav.rs
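While only wav output is available, the other route the guide mentions is splitting the audio into pieces that each fit under the limit. The following is a minimal sketch of that idea, not something the project does today. It assumes the recording is already in memory as interleaved 16-bit samples (the cpal example may produce a different sample format depending on the device), and `split_for_limit` is a hypothetical helper name.

```rust
/// Size of the canonical PCM WAV header added to every chunk.
const WAV_HEADER_BYTES: usize = 44;

/// Splits interleaved 16-bit PCM samples into slices that each fit within
/// `limit_bytes` once written out as a WAV file with a 44-byte header.
/// Chunk boundaries fall on whole frames, so channels never get separated.
fn split_for_limit(samples: &[i16], channels: usize, limit_bytes: usize) -> Vec<&[i16]> {
    assert!(channels > 0, "need at least one channel");
    // One frame holds one sample per channel.
    let bytes_per_frame = channels * std::mem::size_of::<i16>();
    let frames_per_chunk = limit_bytes.saturating_sub(WAV_HEADER_BYTES) / bytes_per_frame;
    assert!(frames_per_chunk > 0, "limit is too small to hold a single frame");
    samples.chunks(frames_per_chunk * channels).collect()
}
```

Note that this cuts at arbitrary points in time, so a word or sentence can end up split across two chunks. Each slice would still need to be written to its own wav file before uploading.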

sloganking added the question label Oct 22, 2023
@jonassmedegaard

jonassmedegaard commented Mar 28, 2024

Since the audio files contain speech, compressing them as Ogg/Opus is far more efficient than mp3, and it seems to be supported by OpenAI: the API appears to accept the Ogg container format, and the code itself supports whatever ffmpeg supports, which (depending on how it is compiled) includes the Opus codec.

So I would suggest looking at the opus crate, and the crates that depend on it, for inspiration on how to save as Opus, with far lighter dependencies than the ffmpeg giant.
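As a rough illustration of what the codec choice means for the 25 MB limit, this sketch estimates how much audio fits at a constant bitrate. The two bitrates are assumptions picked for illustration (128 kbit/s for mp3, 24 kbit/s for Opus speech), not measurements from this project, and container overhead is ignored.

```rust
/// Size limit from the OpenAI guide, read here as 25_000_000 bytes (assumption).
const LIMIT_BYTES: u64 = 25_000_000;

/// Illustrative bitrates in bits per second. Both values are assumptions.
const MP3_BITRATE_BPS: u64 = 128_000;
const OPUS_SPEECH_BITRATE_BPS: u64 = 24_000;

/// Whole seconds of audio that fit in `limit_bytes` at a constant bitrate,
/// ignoring container and header overhead.
fn max_seconds_at_bitrate(limit_bytes: u64, bitrate_bps: u64) -> u64 {
    assert!(bitrate_bps > 0, "bitrate must be positive");
    limit_bytes * 8 / bitrate_bps
}
```

Under these assumed bitrates, `max_seconds_at_bitrate(LIMIT_BYTES, MP3_BITRATE_BPS)` is 1562 seconds (about 26 minutes), while `max_seconds_at_bitrate(LIMIT_BYTES, OPUS_SPEECH_BITRATE_BPS)` is 8333 seconds (over two hours).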
