ffmpeg currently converts the recorded wav files to mp3 files. This is because wav files have no compression, and the OpenAI API has a hard limit that files must be under 25 MB. https://platform.openai.com/docs/guides/speech-to-text/longer-inputs
By default, the Whisper API only supports files that are less than 25 MB. If you have an audio file that is longer than that, you will need to break it up into chunks of 25 MB or less or use a compressed audio format.
If I knew how to save audio recordings to mp3 files in Rust, we could get around this. But I've only seen an example of writing to wav files so far. See: https://github.com/RustAudio/cpal/blob/master/examples/record_wav.rs
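To get a feel for how quickly an uncompressed recording hits that limit, here is a rough sketch of the arithmetic. It assumes "25 MB" means 25 * 1024 * 1024 bytes and ignores the small WAV header; the function and constant names are made up for illustration and are not part of this project.

```rust
/// Upload limit discussed above. Reading "25 MB" as 25 * 1024 * 1024 bytes
/// is an assumption; the docs do not spell out the exact byte count.
const UPLOAD_LIMIT_BYTES: u64 = 25 * 1024 * 1024;

/// Size in bytes of uncompressed PCM audio data (WAV header not included).
fn pcm_bytes(sample_rate_hz: u64, channels: u64, bits_per_sample: u64, seconds: u64) -> u64 {
    sample_rate_hz * channels * (bits_per_sample / 8) * seconds
}

/// Longest whole number of seconds of PCM audio that fits within the limit.
fn max_wav_seconds(sample_rate_hz: u64, channels: u64, bits_per_sample: u64) -> u64 {
    let bytes_per_second = pcm_bytes(sample_rate_hz, channels, bits_per_sample, 1);
    UPLOAD_LIMIT_BYTES / bytes_per_second
}
```

Under these assumptions, `max_wav_seconds(44_100, 2, 16)` gives 148 seconds of CD-quality stereo, and `max_wav_seconds(16_000, 1, 16)` gives 819 seconds (under 14 minutes) of 16 kHz mono, which is why some form of compression is needed for longer recordings.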
Since the audio files contain speech, it is far more efficient to compress them using Ogg/Opus than as mp3, and this seems to be supported by OpenAI (the API seemingly supports the Ogg container format, and the code itself supports whatever ffmpeg supports, which, depending on how it is compiled, includes the Opus codec).
So I would suggest looking at opus and the crates depending on it for inspiration on how to save as Opus, with far lighter dependencies than the ffmpeg giant.
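As a rough illustration of why the codec choice matters under the 25 MB limit, the sketch below estimates how much audio fits at a given constant bitrate. The bitrates in the usage note (128 kbit/s for mp3, 24 kbit/s for Opus speech) are assumed, illustrative values rather than settings taken from this project, container overhead is ignored, and "25 MB" is again read as 25 * 1024 * 1024 bytes.

```rust
/// Upload limit, read as 25 * 1024 * 1024 bytes (an assumption).
const UPLOAD_LIMIT_BYTES: u64 = 25 * 1024 * 1024;

/// Approximate whole seconds of audio that fit within the limit at a
/// constant bitrate given in bits per second. Container overhead is ignored.
fn max_seconds_at_bitrate(bits_per_second: u64) -> u64 {
    UPLOAD_LIMIT_BYTES * 8 / bits_per_second
}
```

With those assumed bitrates, `max_seconds_at_bitrate(128_000)` comes to 1638 seconds (about 27 minutes), while `max_seconds_at_bitrate(24_000)` comes to 8738 seconds (about 145 minutes).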