Qwen2-VL Batch Bug #2495
Comments
Hi @LugerW-A, it supports batch inference, but you need to follow the batch process provided by the official Qwen2-VL repo. Please see more info at: https://github.com/QwenLM/Qwen2-VL?tab=readme-ov-file
@sunnyqgg Hi, following the batch method above, the second output is empty. Have you printed the second output result yourself?
Hi @sun2011yao, do you specify --batch_size when running with multiple batches?
Yes, I run the command as follows:
Hi,
Hi, I removed --image_path, but the second result is still empty. Can you get the correct results on your side?
@sunnyqgg Hi, have you reproduced this problem?
I guess I met the same error.
@sunnyqgg Hi, I ran a command like the following and got an error: `IndexError: index 1 is out of bounds for dimension 0 with size 1`. This error is raised in https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/runtime/multimodal_model_runner.py#L1111, so I added the following code to https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/runtime/multimodal_model_runner.py#L1741.
After that, the second result is empty. Does anyone have an idea about this error? Thanks.
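The failure mode described above can be reproduced with a minimal, framework-free sketch: a per-request parameter is built for batch size 1, but the runner indexes it once per batch entry. Plain Python lists stand in for tensors here, and the names (`mrope_params`, `gather_per_request`) are hypothetical stand-ins, not TensorRT-LLM APIs.

```python
def gather_per_request(params, batch_size):
    # Mimics indexing params[batch_idx] once for every request in the batch.
    return [params[i] for i in range(batch_size)]

mrope_params = ["rotary_cos_sin_for_request_0"]  # built for batch size 1 only

try:
    gather_per_request(mrope_params, batch_size=2)
except IndexError:
    print("reproduced: out-of-bounds index for batch entry 1")

# Workaround in the spirit of the patch discussed above: make sure the
# parameter is built (or replicated) once per batch entry.
mrope_params = mrope_params * 2
print(gather_per_request(mrope_params, batch_size=2))
```

This only illustrates the shape mismatch; in the real runner the fix is to build the mrope parameters with the correct batch dimension rather than duplicating a single entry.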
@sunnyqgg Hi, I have some doubts about the following line of code; should it be modified? TensorRT-LLM/cpp/tensorrt_llm/kernels/unfusedAttentionKernels/unfusedAttentionKernels_2_template.h, line 761 at commit 4420547
Hi, since I don't have your code, can you check the shapes of `input_ids`, `mrope_params`, and `output_ids` in the file tensorrt_llm/runtime/multimodal_model_runner.py?
@sunnyqgg Hi, thank you for your reply. This issue is a bit strange. Below are my complete running steps and related modifications; can you help check them?
Modifications to multimodal_model_runner.py: input_ids shape: [2, 917], output text:
I have met a similar case to @sun2011yao: with 2 images as input for Qwen2-VL-72B, the first returned text describes both images at the same time, and the second text is completely wrong for both images.
My case is a bit different. After applying the fix described above, the same two prompts generate the same result, as expected.
@sunnyqgg Hi, with the above modifications, it seems that for two different prompts, one prompt generates results covering both images while the other generates nothing. I am not sure whether I did something wrong on my side.
@sunnyqgg Following the modifications, it still does not work: both results describe both images at the same time.
Hi,

```python
attention_mask_vit = torch.full([1, seq_length, seq_length],
                                torch.finfo(torch.float16).min,
                                device=image.device,
                                dtype=image.dtype)
for i in range(1, len(cu_seqlens)):
    attention_mask_vit[..., cu_seqlens[i - 1]:cu_seqlens[i],
                       cu_seqlens[i - 1]:cu_seqlens[i]] = 0
```

Thanks.
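For anyone following along: the loop above builds a block-diagonal ViT attention mask. Patches belonging to the same image (segments delimited by `cu_seqlens`) may attend to each other; everything else stays at the large negative fill value. A minimal pure-Python sketch of the same logic, with no torch dependency and `NEG_INF` standing in for `torch.finfo(torch.float16).min`:

```python
NEG_INF = float("-inf")  # stands in for torch.finfo(torch.float16).min

def build_vit_mask(seq_length, cu_seqlens):
    # Start fully masked, then open one square block per image segment,
    # mirroring the attention_mask_vit loop above.
    mask = [[NEG_INF] * seq_length for _ in range(seq_length)]
    for i in range(1, len(cu_seqlens)):
        lo, hi = cu_seqlens[i - 1], cu_seqlens[i]
        for r in range(lo, hi):
            for c in range(lo, hi):
                mask[r][c] = 0.0
    return mask

# Two images of 2 and 3 patches: cu_seqlens = [0, 2, 5]
mask = build_vit_mask(5, [0, 2, 5])
print(mask[0][1])  # 0.0  -> same image, attention allowed
print(mask[0][3])  # -inf -> different images, masked out
```

Without this per-image blocking, patches from different images in the batch attend to each other, which would explain outputs that describe both images at once.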
System Info
x86
TensorRT-LLM 0.16.0
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Qwen2-VL examples
Expected behavior
Does Qwen2-VL support batch prompts?
When the input is a batch, only the first result returns correctly, while the rest are all empty.
```python
print(input_ids.shape)
print(prompt_table.shape)
print(prompt_tasks)
outputs = self.model.generate(
    input_ids,
    input_position_ids=None,
    mrope_params=mrope_params,
    sampling_config=None,
    prompt_table=prompt_table,
    prompt_tasks=prompt_tasks,
    max_new_tokens=max_new_tokens,
    end_id=end_id,
    pad_id=self.model.tokenizer.pad_token_id
    if self.model.tokenizer.pad_token_id is not None else
    self.model.tokenizer.all_special_ids[0],
    top_k=self.args.top_k,
    top_p=self.args.top_p,
    temperature=self.args.temperature,
    repetition_penalty=self.args.repetition_penalty,
    num_beams=self.args.num_beams,
    output_sequence_lengths=True,
    return_dict=True)
```
Actual behavior
The batch entries of input_ids only differ in the first dimension, but the results are incorrect (empty).
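One way to narrow down where the empty strings come from is to look at how batched outputs are sliced before detokenization. The sketch below is an assumption about the runner's output layout, not TensorRT-LLM code: the names mirror the `generate` call above, and `decode` is a hypothetical stand-in for the tokenizer. If `sequence_lengths[b]` never advances past the prompt length for batch entry `b`, the decoded text for that entry is empty, matching the symptom reported here.

```python
def decode_batch(output_ids, sequence_lengths, input_lengths, decode):
    # Slice each batch row between the end of its prompt and its
    # reported sequence length, then detokenize only the new tokens.
    texts = []
    for b, row in enumerate(output_ids):
        new_tokens = row[input_lengths[b]:sequence_lengths[b]]
        texts.append(decode(new_tokens))
    return texts

# Toy example with a fake vocabulary: entry 1's sequence length equals
# its prompt length, so its slice is empty and so is the decoded text.
decode = lambda ids: " ".join(f"tok{i}" for i in ids)
output_ids = [[1, 2, 3, 4], [5, 6, 6, 6]]
texts = decode_batch(output_ids, sequence_lengths=[4, 2],
                     input_lengths=[2, 2], decode=decode)
print(texts)  # ['tok3 tok4', '']
```

Printing `sequence_lengths` alongside the prompt lengths in the real runner would show whether generation for the second entry stopped immediately (e.g. an early end-of-sequence) or whether the slicing itself is wrong.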
Additional notes
None