The relevant lines in `onnxruntime/python/tools/quantization/operators/matmul.py`:

```python
if is_per_channel:
    self.quantizer.quantize_weight_tensor_per_channel(tensor_name, channel_axis)
else:
    self.quantizer.quantize_activation_tensor(tensor_name)
```

Apparently, there is no handling for weights in `per_tensor` mode here, which results in the weight type and the activation type being the same for a per-tensor quantized `MatMul` operator.
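A minimal sketch of the kind of handling the fix adds: per-tensor weight initializers are routed through the quantizer's weight path rather than the activation path. The `is_weight` flag and the `quantize_weight_tensor` call are assumptions about the surrounding code, not the exact PR diff:

```python
if is_per_channel:
    self.quantizer.quantize_weight_tensor_per_channel(tensor_name, channel_axis)
elif is_weight:  # assumed flag: tensor_name refers to an initializer (a weight)
    # Quantize the weight with weight_type instead of letting it fall
    # through to the activation path, which applied activation_type.
    self.quantizer.quantize_weight_tensor(tensor_name)
else:
    self.quantizer.quantize_activation_tensor(tensor_name)
```

With this split, `weight_type` and `activation_type` can differ in per-tensor mode just as they already can in per-channel mode.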
### Description

Fix the wrong per-tensor quantized weight type for `MatMul`.

### Motivation and Context

Fixes the bug described in #21346:
### Describe the issue

I am using `quantize_static` for quantization, and I found that the weight type of `MatMul` is wrong when the activation type differs from the weight type in `per_tensor` mode. I have located the relevant lines of code (quoted above): https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/tools/quantization/operators/matmul.py#L225C1-L228C71
As the snippet shows, weights are only handled in the per-channel branch; in `per_tensor` mode they fall through to `quantize_activation_tensor`, so the weight type of the quantized `MatMul` ends up identical to the activation type.

### To reproduce
The issue can be reproduced with the relevant files in demo.zip. The reproduction commands are as follows:

which will produce a quantized model with 16-bit `MatMul` weights. ❌

which will produce a quantized model with 8-bit `MatMul` weights. ❌
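The exact reproduction commands ship in demo.zip and are not reproduced here. A minimal sketch of a per-tensor `quantize_static` call with mismatched activation and weight types that exercises this path might look like the following (the model path, input name and shape, and the random calibration reader are assumptions):

```python
import numpy as np

from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static


class RandomDataReader(CalibrationDataReader):
    """Feeds a few random samples for calibration (input name/shape are assumptions)."""

    def __init__(self, input_name="input", shape=(1, 64), n=8):
        self._data = iter(
            [{input_name: np.random.rand(*shape).astype(np.float32)} for _ in range(n)]
        )

    def get_next(self):
        return next(self._data, None)


quantize_static(
    model_input="model.onnx",           # a model containing MatMul with an initializer weight
    model_output="model_quant.onnx",
    calibration_data_reader=RandomDataReader(),
    per_channel=False,                  # per-tensor mode, where the bug shows up
    activation_type=QuantType.QUInt16,  # 16-bit activations
    weight_type=QuantType.QInt8,        # 8-bit weights requested...
)
# ...but before the fix, the MatMul weight comes out with the activation type instead.
```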
- **Urgency:** No response
- **Platform:** Linux
- **OS Version:** Ubuntu 22.04
- **ONNX Runtime Installation:** Released Package
- **ONNX Runtime Version or Commit ID:** 1.18.1
- **ONNX Runtime API:** Python
- **Architecture:** X64
- **Execution Provider:** Default CPU
- **Execution Provider Library Version:** No response