Whisper Advanced Parameters
jhj0517 edited this page Sep 25, 2023 · 6 revisions
| Parameter | Description |
|---|---|
| beam_size | Number of candidate sequences kept at each step of the beam search algorithm. TL;DR: a larger beam size gives higher quality but slower transcription; a smaller beam size gives lower quality but faster transcription. |
| log_prob_threshold | Parameter related to how Whisper handles the "silent" parts of the audio. If the average log probability over sampled tokens is below this value, the decoding is treated as failed. TL;DR: lower this value if you want Whisper to be more "sensitive" to small sounds. Adjust it together with no_speech_threshold and compare the results. |
| no_speech_threshold | Parameter related to how Whisper handles the "silent" parts of the audio. If the no_speech probability is higher than this value AND the average log probability over sampled tokens is below log_prob_threshold, the segment is considered silent. TL;DR: raise this value if you want Whisper to be more "sensitive" to small sounds, since fewer segments will then exceed it and be marked silent. Adjust it together with log_prob_threshold and compare the results. |
| compute_type | Compute type, such as float16 or float32. Defaults to float16 if CUDA is enabled, otherwise float32. |
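
The way no_speech_threshold and log_prob_threshold combine can be easier to follow as code. The snippet below is a minimal sketch of the rule as described in the table, not Whisper's actual implementation: the `Segment` class, the `is_silent` function, and the numeric values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for the two per-segment scores described in the table."""
    no_speech_prob: float  # probability that the segment contains no speech
    avg_logprob: float     # average log probability over the sampled tokens


def is_silent(segment: Segment, no_speech_threshold: float, log_prob_threshold: float) -> bool:
    """Return True only when BOTH conditions from the table hold.

    Assumption: this mirrors the wording of the table, not any library's internals.
    """
    return (
        segment.no_speech_prob > no_speech_threshold
        and segment.avg_logprob < log_prob_threshold
    )


# Illustrative scores for a quiet, uncertain stretch of audio.
quiet = Segment(no_speech_prob=0.5, avg_logprob=-1.2)

# no_speech_prob (0.5) does not exceed 0.6, so the segment is kept.
print(is_silent(quiet, no_speech_threshold=0.6, log_prob_threshold=-1.0))  # False

# Both checks fire: 0.5 > 0.4 and -1.2 < -1.0, so the segment is treated as silent.
print(is_silent(quiet, no_speech_threshold=0.4, log_prob_threshold=-1.0))  # True

# Lowering log_prob_threshold to -1.5 keeps the segment, because -1.2 is no longer below it.
print(is_silent(quiet, no_speech_threshold=0.4, log_prob_threshold=-1.5))  # False
```

Because a segment is dropped only when both checks fire, changing one threshold can cancel out the effect of the other, which is why the two are best tuned together.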