
Support distil-whisper #411

Merged: 9 commits merged into k2-fsa:master on Nov 6, 2023
Conversation

csukuangfj (Collaborator)

Support models from https://github.com/huggingface/distil-whisper

You can find a Colab notebook for it: Open In Colab


Usage

  1. Install sherpa-onnx
pip install sherpa-onnx
  2. Use the model
mkdir sherpa-onnx-whisper-distil-medium.en
cd sherpa-onnx-whisper-distil-medium.en
wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en/resolve/main/distil-medium.en-decoder.int8.onnx
wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en/resolve/main/distil-medium.en-encoder.int8.onnx
wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en/resolve/main/distil-medium.en-tokens.txt
wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en/resolve/main/test_wavs/0.wav
wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en/resolve/main/test_wavs/1.wav
wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-distil-medium.en/resolve/main/test_wavs/8k.wav
ls -lh

sherpa-onnx-offline \
  --whisper-encoder=./distil-medium.en-encoder.int8.onnx \
  --whisper-decoder=./distil-medium.en-decoder.int8.onnx \
  --tokens=./distil-medium.en-tokens.txt \
  ./0.wav \
  ./1.wav \
  ./8k.wav

The output is given below:

total 548M
-rw-r--r-- 1 root root 208K Nov  6 14:18 0.wav
-rw-r--r-- 1 root root 523K Nov  6 14:18 1.wav
-rw-r--r-- 1 root root  76K Nov  6 14:18 8k.wav
-rw-r--r-- 1 root root 234M Nov  6 12:16 distil-medium.en-decoder.int8.onnx
-rw-r--r-- 1 root root 313M Nov  6 12:32 distil-medium.en-encoder.int8.onnx
-rw-r--r-- 1 root root 816K Nov  6 14:18 distil-medium.en-tokens.txt
/project/sherpa-onnx/csrc/parse-options.cc:Read:361 sherpa-onnx-offline --whisper-encoder=./distil-medium.en-encoder.int8.onnx --whisper-decoder=./distil-medium.en-decoder.int8.onnx --tokens=./distil-medium.en-tokens.txt ./0.wav ./1.wav ./8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./distil-medium.en-encoder.int8.onnx", decoder="./distil-medium.en-decoder.int8.onnx", language="", task="transcribe"), tdnn=OfflineTdnnModelConfig(model=""), zipformer_ctc=OfflineZipformerCtcModelConfig(model=""), tokens="./distil-medium.en-tokens.txt", num_threads=2, debug=False, provider="cpu", model_type=""), lm_config=OfflineLMConfig(model="", scale=0.5), ctc_fst_decoder_config=OfflineCtcFstDecoderConfig(graph="", max_active=3000), decoding_method="greedy_search", max_active_paths=4, hotwords_file="", hotwords_score=1.5)
Creating recognizer ...
Started
/project/sherpa-onnx/csrc/offline-stream.cc:AcceptWaveformImpl:113 Creating a resampler:
   in_sample_rate: 8000
   output_sample_rate: 16000

Done!

./0.wav
{"text": " After early nightfall, the yellow lamps would light up here and there the squalid quarter of the brothels.", "timestamps": [], "tokens":[" After", " early", " night", "fall", ",", " the", " yellow", " lamps", " would", " light", " up", " here", " and", " there", " the", " squ", "alid", " quarter", " of", " the", " bro", "the", "ls", "."]}
----
./1.wav
{"text": " God, as a direct consequence of the sin which man thus punished, had given her a lovely child whose place was on that same dishonored bosom to connect her parent forever with the race and descent of mortals, and to be finally a blessed soul in heaven.", "timestamps": [], "tokens":[" God", ",", " as", " a", " direct", " consequence", " of", " the", " sin", " which", " man", " thus", " punished", ",", " had", " given", " her", " a", " lovely", " child", " whose", " place", " was", " on", " that", " same", " dishon", "ored", " bos", "om", " to", " connect", " her", " parent", " forever", " with", " the", " race", " and", " descent", " of", " mortals", ",", " and", " to", " be", " finally", " a", " blessed", " soul", " in", " heaven", "."]}
----
./8k.wav
{"text": " Yet these thoughts affected Hester Pren less with hope than apprehension.", "timestamps": [], "tokens":[" Yet", " these", " thoughts", " affected", " H", "ester", " P", "ren", " less", " with", " hope", " than", " apprehension", "."]}
----
num threads: 2
decoding method: greedy_search
Elapsed seconds: 49.245 s
Real time factor (RTF): 49.245 / 28.165 = 1.748
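
To decode your own recordings with the same setup, the binary accepts a few extra options. The following is a minimal sketch only: your-audio.wav is a placeholder file name, and --num-threads is assumed to control the num_threads value shown in the config dump above. Input files at other sample rates are resampled automatically, as the 8k.wav log shows.

# a minimal sketch; your-audio.wav is a placeholder
sherpa-onnx-offline \
  --whisper-encoder=./distil-medium.en-encoder.int8.onnx \
  --whisper-decoder=./distil-medium.en-decoder.int8.onnx \
  --tokens=./distil-medium.en-tokens.txt \
  --num-threads=4 \
  ./your-audio.wav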

csukuangfj merged commit a65cdc3 into k2-fsa:master on Nov 6, 2023
32 of 33 checks passed
csukuangfj deleted the distilled-whisper branch on November 6, 2023 at 14:33

egorsmkv commented Nov 6, 2023

How can I run it using a GPU?

csukuangfj (Collaborator, Author)

@egorsmkv
Please see
https://github.com/k2-fsa/colab/blob/master/sherpa-onnx/Run_distil_whisper_with_sherpa_onnx_on_GPU_ipynb.ipynb
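
For quick reference, below is a minimal command-line sketch of GPU decoding. It assumes a CUDA-enabled build of sherpa-onnx and that --provider selects the onnxruntime execution provider (the provider field shown in the config dump above); the linked Colab notebook remains the authoritative recipe.

# assumes sherpa-onnx was built/installed with CUDA support
sherpa-onnx-offline \
  --provider=cuda \
  --whisper-encoder=./distil-medium.en-encoder.int8.onnx \
  --whisper-decoder=./distil-medium.en-decoder.int8.onnx \
  --tokens=./distil-medium.en-tokens.txt \
  ./0.wav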


egorsmkv commented Nov 7, 2023

Thank you!


A-Raafat commented Dec 4, 2023

Would distil-large-v2 be converted the same way?
https://huggingface.co/distil-whisper/distil-large-v2

csukuangfj (Collaborator, Author)

> https://huggingface.co/distil-whisper/distil-large-v2

I am afraid it can't. The reason is that the large model exceeds the file-size limit of ONNX (2 GB for a single protobuf file).

@Goddard
Copy link

Goddard commented Dec 7, 2023

> https://huggingface.co/distil-whisper/distil-large-v2
>
> I am afraid it can't. The reason is that the large model exceeds the file-size limit of ONNX (2 GB for a single protobuf file).

Not sure if this interests you, but it could resolve the issue: microsoft/onnxruntime#15349
