how to use this model with whisper.cpp #6

stopthinking102 · 2024-08-14T22:23:00Z

Can you share which parameter to set to use this model in whisper.cpp?
Is it the n_audio_ctx parameter, with 1500 referring to 1.5 seconds? It would have been great if there were an Android sample.

// medium
// hparams: {
// 'n_mels': 80,
// 'n_vocab': 51864,
// 'n_audio_ctx': 1500,
// 'n_audio_state': 1024,
// 'n_audio_head': 16,
// 'n_audio_layer': 24,
// 'n_text_ctx': 448,
// 'n_text_state': 1024,
// 'n_text_head': 16,
// 'n_text_layer': 24
// }
//
// default hparams (Whisper tiny)
struct whisper_hparams {
int32_t n_vocab = 51864;
int32_t n_audio_ctx = 1500;
int32_t n_audio_state = 384;

abb128 · 2024-08-14T23:49:15Z

You can run the whisper.cpp main example and set -ac to a number between 1 and 1500. The unit is not milliseconds; it is 50 per second of audio. So a 30 second clip (the maximum) is 1500, and a 5 second clip is 5 * 50 = 250. It also usually helps to add a constant of around 64, so for 5 seconds you'd use 5 * 50 + 64 = 314.

Programmatically, you can set the audio_ctx field of whisper_full_params.
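
The rule of thumb above (50 per second, plus a margin of about 64, capped at 1500) can be sketched as a small helper. This is only an illustration: `audio_ctx_for_seconds`, `kFramesPerSecond`, and `kMaxAudioCtx` are hypothetical names and not part of whisper.cpp, and rounding partial seconds up is an assumption.

```cpp
#include <algorithm>
#include <cmath>

// From the thread: 1500 corresponds to a 30 second clip, i.e. 50 per second.
constexpr int kFramesPerSecond = 50;
constexpr int kMaxAudioCtx = 1500;

// Hypothetical helper (not a whisper.cpp function): estimate an audio
// context value for a clip of `seconds` length, adding a safety margin
// (around 64, as suggested above) and keeping the result within [1, 1500].
int audio_ctx_for_seconds(double seconds, int margin = 64) {
    // Assumption: round partial seconds up so the clip is fully covered.
    const int frames = static_cast<int>(std::ceil(seconds * kFramesPerSecond));
    return std::max(1, std::min(frames + margin, kMaxAudioCtx));
}
```

For a 5 second clip this gives 314, which could then be passed as `-ac 314` on the command line or assigned to the audio context field of whisper_full_params before running inference.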
