Qwen2-VL cannot be converted to a checkpoint on TensorRT-LLM #2658
Comments
@sunnyqgg would you please take a look at this issue?
Hi, Thanks.
I tried to rebuild the Docker image with the latest source code from the main branch. The checkpoint conversion has been fixed for Qwen2-VL. However, run.py still does not seem to work for Qwen2-VL. I have tried ... but got ... Any suggestions here? Thanks!
The issue happens here:
Hi @xunuohope1107, Thanks.
Yeah, I have checked ...
Do you mean modifying the code like this: change ...
Do I still need to install from source based on the code in 21fac7, or can I use the latest version of transformers directly?
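For what it's worth, a quick way to answer that locally (my own sketch, not something from this thread) is to check which transformers build is installed and whether it already ships the Qwen2-VL classes:

```python
# Sketch (not from this thread): check the installed transformers version
# and whether it already ships Qwen2-VL support, before deciding between
# a pinned source commit and the latest release.
import transformers

print(transformers.__version__)  # Qwen2-VL support was added around 4.45

# This import fails on releases that predate Qwen2-VL support.
from transformers import Qwen2VLForConditionalGeneration  # noqa: F401
print("Qwen2-VL classes are available")
```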
System Info
Who can help?
I have tested the examples under examples/multimodal. But when I try to convert Qwen2-VL-7B to a checkpoint via

python3 ../qwen/convert_checkpoint.py --model_dir Qwen2-VL-7B-Instruct \
    --output_dir trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu \
    --dtype float16

I get the error

Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}

so it seems Qwen2-VL is not supported. Is this due to the Docker image I used, or do I have to build TensorRT-LLM from source?
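For context, the rope_scaling entry the converter complains about can be printed straight from the downloaded Hugging Face config. A minimal sketch I added for illustration, assuming the model_dir used above:

```python
# Minimal sketch (added for illustration, assuming the model_dir above):
# print the rope_scaling entry that triggers the "Unrecognized keys" warning.
import json

with open("Qwen2-VL-7B-Instruct/config.json") as f:
    cfg = json.load(f)

# Qwen2-VL uses multimodal RoPE ("mrope"); tooling that only understands the
# standard rope_type values treats 'mrope_section' as an unknown key.
print(cfg["rope_scaling"])
# e.g. {'type': 'mrope', 'mrope_section': [16, 24, 24]}
```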
Information

Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
python3 ../qwen/convert_checkpoint.py --model_dir Qwen2-VL-7B-Instruct \
    --output_dir trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu \
    --dtype float16
Expected behavior
The converted checkpoint is written to trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu without any errors.
actual behavior
Got error log:
root@04292e29d243:/workspace/TensorRT-LLM/examples/multimodal# python3 ../qwen/convert_checkpoint.py --model_dir Qwen2-VL-7B-Instruct \
    --output_dir trt_models/Qwen2-VL-7B-Instruct/fp16/1-gpu \
    --dtype float16
2025-01-03 11:20:24.426668: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-01-03 11:20:24.441389: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1735903224.456763    2272 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1735903224.461320    2272 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-03 11:20:24.477010: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 AVX512_FP16 AVX_VNNI AMX_TILE AMX_INT8 AMX_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
[TensorRT-LLM] TensorRT-LLM version: 0.16.0
0.16.0
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/functional.py", line 656, in from_string
    return RotaryScalingType[s]
           ~~~~~~~~~~~~~~~~~^^^
  File "/usr/lib/python3.12/enum.py", line 814, in __getitem__
    return cls._member_map_[name]
           ~~~~~~~~~~~~~~~~^^^^^^
KeyError: 'default'
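The final KeyError is a plain Enum name lookup failing: the Hugging Face config reports rope_type 'default', which is not a member name of TensorRT-LLM's RotaryScalingType enum. A standalone illustration of the mechanism (the member names below are made up for the demo, not TensorRT-LLM's actual definition):

```python
# Standalone illustration of the traceback above (member names are made up
# for the demo, not TensorRT-LLM's actual RotaryScalingType definition).
from enum import IntEnum

class RotaryScalingTypeDemo(IntEnum):
    none = 0
    linear = 1
    dynamic = 2

s = "default"  # the rope_type reported by the Hugging Face config
try:
    print(RotaryScalingTypeDemo[s])  # Enum[...] looks up by member *name*
except KeyError as err:
    print(f"KeyError: {err}")  # -> KeyError: 'default'
```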
additional notes
I have tried Phi-3-vision and Qwen2-7B-Instruct as well; both of them work.