Qwen1.5-MoE-A2.7B-Chat w4a16 Quantization Failed #189
Labels
bug
Something isn't working
Comments
The above code works on release v0.1.0.
Hi @donpromax, thank you for filing an issue; I used
Describe the bug
I tried to quantize Qwen1.5-MoE-A2.7B-Chat to w4a16 for the vLLM PR vllm-project/vllm#7766.
Quantization fails with: TypeError: forward() got multiple values for argument 'attention_mask'
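For readers unfamiliar with this class of error, here is a minimal sketch of how Python produces it. It does not reproduce the actual code path in the library. The `forward` signature and the `replay_layer_call` helper are hypothetical stand-ins, assumed only for illustration: the error appears whenever a caller passes a value positionally into the slot that `attention_mask` occupies and also passes `attention_mask` as a keyword.

```python
def forward(hidden_states, attention_mask=None, position_ids=None):
    """Hypothetical stand-in for a decoder layer's forward signature."""
    return hidden_states


def replay_layer_call(layer_fn, args, kwargs):
    """Hypothetical helper: replay captured inputs and report any TypeError."""
    try:
        layer_fn(*args, **kwargs)
        return None
    except TypeError as exc:
        return str(exc)


hidden, mask = [0.0, 1.0], [1, 1]

# Fine: the mask is passed once, by keyword.
ok = replay_layer_call(forward, (hidden,), {"attention_mask": mask})

# Fails: the second positional argument already binds to attention_mask,
# so the keyword supplies a second value for the same parameter.
clash = replay_layer_call(forward, (hidden, mask), {"attention_mask": mask})
print(clash)
```

If something similar is happening here, the place to look is wherever captured layer inputs are replayed: a value cached as a positional argument may also be present in the cached keyword arguments.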
Expected behavior
A clear and concise description of what you expected to happen.
Environment
Include all relevant environment information:
Version or commit hash: main 3fb4212f
To Reproduce
Exact steps to reproduce the behavior:
My code
Errors
If applicable, add a full print-out of any errors or exceptions that are raised or include screenshots to help explain your problem.
Additional context
Add any other context about the problem here. Also include any relevant files.