v1.3.0

@shashikg shashikg released this 28 Jan 16:02

Release Notes

  1. Support for TensorRT-LLM Backend
  2. Inclusion of Example Notebooks

TensorRT-LLM Backend

WhisperS2T now supports NVIDIA's TensorRT-LLM (https://github.com/NVIDIA/TensorRT-LLM) as a backend, delivering a further twofold improvement in inference time compared to the CTranslate2 backend. The best current configuration on an A30 GPU transcribes a 1-hour file in approximately 18 seconds. Updated benchmarks are shown below:

[Image: benchmarks_v1_3_0]
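
To put the 18-second figure in context, the short sketch below converts it into a speed-up over real time. It uses only the two numbers quoted above (a 1-hour file and roughly 18 seconds of processing); the function name `speedup_factor` is made up for illustration and is not part of WhisperS2T.

```python
def speedup_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many times faster than real time the audio was processed."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures quoted in the release notes: a 1-hour file, approximately 18 seconds.
audio_seconds = 60 * 60
processing_seconds = 18

factor = speedup_factor(audio_seconds, processing_seconds)
print(f"about {factor:.0f}x faster than real time")
```

Since the 18 seconds is itself approximate and tied to one A30 configuration, the result is best read as an order-of-magnitude indication rather than a guaranteed throughput.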