How to convert quantized ONNX model from Tensor-Oriented format to Operator-Oriented format? #21137
Comments
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.
I have the same question. Hi @hoangtv2000, did you find a solution?
@UsingtcNower and @hoangtv2000, you can run onnxruntime in offline mode to optimize the model into the Operator-oriented format: https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html#offline-mode. BTW, why do you want to do this?
I want to use your method to run the model on our self-designed chip, which runs efficiently only with integer arithmetic.
You should use the built-in static post-training quantization function of the onnxruntime library, quantize_static, and remember to set quant_format to QuantFormat.QOperator.
Describe the issue
I have the quantized model represented as the graph below, and I want to convert all of the QDQ operators in this model to QOperator operators. What should I do?
To reproduce
Have not reproduced yet.
Urgency
No response
Platform
Linux
OS Version
Ubuntu 22.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.15.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response