DakeQQ/Automatic-Speech-Recognition-ASR-ONNX

Automatic-Speech-Recognition-ASR-ONNX

Transcribe audio into text with ONNX Runtime.

Supported Models

  1. Single Model:
  2. Combined Models (ASR + Speaker Identification):

Features

  • End-to-end speech recognition with built-in STFT processing.
    Input: Audio file
    Output: Transcription result
  • Seamlessly integrate with these additional tools for improved performance:
  • The Whisper models here do not support automatic language detection. Specify the target language explicitly.
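
Because STFT processing is built into the exported model, the caller only has to supply raw waveform samples. The following is a minimal sketch of preparing that input, not the repository's own loader. It assumes a 16-bit mono PCM WAV file and fixed-size chunks; `read_pcm16_mono` and `split_into_chunks` are hypothetical helper names, the default chunk size is borrowed from the performance table, and zero-padding the last chunk is an assumption.

```python
import array
import sys
import wave

CHUNK_SIZE = 128000  # samples per chunk; value borrowed from the performance table


def read_pcm16_mono(path):
    """Read a 16-bit mono PCM WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        frames = wav.readframes(wav.getnframes())
        rate = wav.getframerate()
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples, rate


def split_into_chunks(samples, chunk_size=CHUNK_SIZE):
    """Split samples into equal-length chunks, zero-padding the last one (assumption)."""
    chunks = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        if len(chunk) < chunk_size:
            chunk.extend([0] * (chunk_size - len(chunk)))
        chunks.append(chunk)
    return chunks
```

Each chunk could then be converted to the tensor layout the chosen model expects before being passed to ONNX Runtime; that layout depends on the export and is not shown here.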

Learn More


Performance

| OS | Device | Backend | Model | Real-Time Factor (Chunk Size: 128000 or 8s) |
|---|---|---|---|---|
| Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall f32 | 0.037 |
| Ubuntu 24.04 | Laptop | CPU i5-7300HQ | SenseVoiceSmall q8f32 | 0.075 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall f32 | 0.019 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall q8f32 | 0.022 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | SenseVoiceSmall + ERes2NetV2_w24s4ep4 f32 | 0.10 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-en q8f32 | 0.15 |
| Ubuntu 24.04 | Desktop | CPU i3-12300 | Whisper-Large-v3-Turbo-en q8f32 | 0.073 |
| Ubuntu 24.04 | Laptop | CPU i5-7300HQ | Paraformer-Small-Chinese f32 | 0.04 |
| Ubuntu 24.04 | Laptop | CPU i5-7300HQ | Paraformer-Large-English q8f32 | 0.14 |
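
The README does not say how these figures were measured. Under the conventional definition, the real-time factor is processing time divided by audio duration, so values below 1 mean faster than real time. The sketch below shows that calculation; `real_time_factor` and `measure_rtf` are hypothetical helper names, and `transcribe` stands in for whatever callable runs the model.

```python
import time


def real_time_factor(processing_seconds, audio_seconds):
    """Conventional RTF: time spent processing divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe, audio_seconds, *args):
    """Time one call to `transcribe` (a placeholder callable) and return its RTF."""
    start = time.perf_counter()
    transcribe(*args)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)
```

Read this way, an RTF of 0.037 on an 8 s chunk corresponds to roughly 0.3 s of compute per chunk.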

Coming Soon 🚀

  • None
