onnxruntime quantization weights not tied #21277
Comments
You can try running the quantization preprocess and then call the quantization script. It should resolve the issue:
After using the preprocess, it raised an error.

Can you please check this, @tianleiwu.
@yufenglee, please look at the quantization tool issue.
The symbolic_shape_infer step fails. You can disable the symbolic shape inference with the option --skip_symbolic_shape.
After using this option, the model size increased from 113 MB to 265 MB, which is not expected.
Describe the issue
I have a model with tied (shared) weights. After quantization, one branch is replaced with quantized weights, but the other branch still retains the original float weights.
To reproduce
from onnxruntime.quantization import quantize_dynamic, QuantType
Urgency
No response
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response