Whisper Advanced Parameters

Advanced Parameters

Parameter	Description
`beam_size`	Number of candidate sequences kept at each step of the beam search algorithm. TL;DR: A larger beam size gives higher-quality but slower transcription; a smaller beam size gives faster but lower-quality transcription.
`log_prob_threshold`	Controls how Whisper handles the "silent" parts of the audio. If the average log probability over sampled tokens is below this value, the decoding is treated as failed. TL;DR: Lower this value if you want Whisper to be more "sensitive" to small sounds. Adjust it together with `no_speech_threshold` and compare the results.
`no_speech_threshold`	Controls how Whisper handles the "silent" parts of the audio. If the no-speech probability is higher than this value AND the average log probability over sampled tokens is below `log_prob_threshold`, the segment is considered silent. TL;DR: Raise this value if you want Whisper to be more "sensitive" to small sounds, since fewer segments will then be classified as silent. Adjust it together with `log_prob_threshold` and compare the results.
`compute_type`	Numeric precision used for inference, such as `float16` or `float32`. Defaults to `float16` if CUDA is enabled, otherwise `float32`.
`condition_on_previous_text`	If True, the previous output of the model is provided as a prompt for the next window. Disabling it may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop, such as repetition looping or timestamps going out of sync. TL;DR: If a failure loop (repetitive hallucination) occurs, consider setting this to False.
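
The interaction between `no_speech_threshold` and `log_prob_threshold` can be easier to see in code. The following is a minimal sketch of the rule described in the table, not Whisper's actual implementation: the function name `is_silent_segment` and the sample probabilities are hypothetical, and the default values `0.6` and `-1.0` are assumed here for illustration.

```python
def is_silent_segment(
    no_speech_prob: float,
    avg_log_prob: float,
    no_speech_threshold: float = 0.6,   # assumed default, for illustration
    log_prob_threshold: float = -1.0,   # assumed default, for illustration
) -> bool:
    """Return True when a segment would be treated as silent.

    Both conditions from the table must hold:
    - the no-speech probability is above `no_speech_threshold`, AND
    - the average log probability is below `log_prob_threshold`.
    """
    return (
        no_speech_prob > no_speech_threshold
        and avg_log_prob < log_prob_threshold
    )


# A hypothetical quiet segment: fairly likely to be silence, low confidence.
quiet = {"no_speech_prob": 0.7, "avg_log_prob": -1.2}

# With the assumed defaults, both conditions hold, so the segment is dropped.
dropped_by_default = is_silent_segment(**quiet)

# Raising `no_speech_threshold` above 0.7 keeps the segment.
kept_with_higher_no_speech = not is_silent_segment(**quiet, no_speech_threshold=0.8)

# Lowering `log_prob_threshold` below -1.2 also keeps the segment.
kept_with_lower_log_prob = not is_silent_segment(**quiet, log_prob_threshold=-1.5)
```

Because the two conditions are combined with AND, relaxing either threshold is enough to keep a borderline segment, which is why the two parameters are usually tuned together.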