What about int8 weights? #608

Open
AntonThai2022 opened this issue Jun 11, 2024 · 3 comments

Comments

@AntonThai2022

Hello!
I built int8 weights:
INFERENCE_PRECISION=float16
WEIGHT_ONLY_PRECISION=int8
MAX_BEAM_WIDTH=4
MAX_BATCH_SIZE=8
checkpoint_dir=whisper_large_v3_weights_${WEIGHT_ONLY_PRECISION}
output_dir=whisper_large_v3_${WEIGHT_ONLY_PRECISION}

# Convert the large-v3 model weights into TensorRT-LLM format.
python3 convert_checkpoint.py \
    --use_weight_only \
    --weight_only_precision $WEIGHT_ONLY_PRECISION \
    --output_dir $checkpoint_dir

So I got whisper_large_v3_weights_int8 and put it in sherpa/triton/whisper/model_repo_whisper_trtllm/whisper/1, but it does not work.
I tried renaming it to whisper_large_v3, but that did not help.
Is it possible to run int8 Whisper with your repo and Docker image?
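
For reference, the two directory variables set in the post expand to different names, which is easy to overlook. A minimal shell sketch using only the values given above (nothing here calls TensorRT-LLM):

```shell
# Values copied from the post above.
WEIGHT_ONLY_PRECISION=int8
checkpoint_dir=whisper_large_v3_weights_${WEIGHT_ONLY_PRECISION}
output_dir=whisper_large_v3_${WEIGHT_ONLY_PRECISION}

# Show what each variable expands to.
echo "checkpoint_dir=$checkpoint_dir"   # prints checkpoint_dir=whisper_large_v3_weights_int8
echo "output_dir=$output_dir"           # prints output_dir=whisper_large_v3_int8
```

In the command shown, only `$checkpoint_dir` is passed to `convert_checkpoint.py`; `$output_dir` is defined but not used by that step. That it is meant for a later engine-build step is an assumption, not something the post states.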

@csukuangfj
Collaborator

@yuekaizhang Could you have a look? Thanks!

@yuekaizhang
Collaborator

@AntonThai2022 Would you mind pasting the error logs here?

@AntonThai2022
Author

I apologize for opening this issue. I had mixed up the folder containing the intermediate weights with the folder containing the engine weights. Once I took the correct folder and renamed it, everything worked. It's strange that the regular version takes 8GB while the 8-bit version takes 7GB, although the speed almost doubled.
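
For anyone hitting the same mix-up, here is a rough sketch of a check to run before copying a folder into the model repository. The helper name `classify_model_dir` is hypothetical. The check also rests on an assumption: that built TensorRT engines are stored as files ending in `.engine`, and that a converted checkpoint folder contains none. Verify this against the files your own build produces.

```shell
# Hypothetical helper: report whether a directory contains built engine files.
# Assumption: engine files end in ".engine"; a folder holding only converted
# checkpoint weights has no such files.
classify_model_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 1
    fi
    # Search recursively, since engines may sit in subfolders.
    if [ -n "$(find "$dir" -type f -name '*.engine' | head -n 1)" ]; then
        echo "engine"
    else
        echo "no-engine"
    fi
}
```

Running `classify_model_dir whisper_large_v3_weights_int8` and then the same on the other folder should show which of the two actually contains engines. A result of `no-engine` suggests the folder holds only intermediate weights.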
