v1.3.0

shashikg released this 28 Jan 16:02

fae33f7

Release Notes

Support for TensorRT-LLM Backend
Inclusion of Example Notebooks

TensorRT-LLM Backend

WhisperS2T now supports NVIDIA's TensorRT-LLM (https://github.com/NVIDIA/TensorRT-LLM) as a backend, delivering a further twofold speedup in inference compared to the CTranslate2 backend. With the current best configuration on an A30 GPU, a 1-hour file is transcribed in approximately 18 seconds. Updated benchmarks are detailed below.
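To put the 18-second figure in perspective, the short sketch below converts it into a speed-up over real time. This is only arithmetic on the numbers quoted in this note, not a benchmark result; the helper name `real_time_speedup` is made up for illustration and is not part of WhisperS2T.

```python
def real_time_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures quoted in the release note: a 1-hour file in roughly 18 seconds on an A30.
audio_seconds = 60 * 60      # one hour of audio
trt_llm_seconds = 18.0       # approximate transcription time reported above

speedup = real_time_speedup(audio_seconds, trt_llm_seconds)
print(f"Approximate speed-up over real time: {speedup:.0f}x")
```

Since the reported time is approximate, the resulting factor should be read as a ballpark figure for that specific GPU and configuration.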
