Fix onnx quantizer activation and weight type attribute #17651
Conversation
cc @tianleiwu @yufenglee this looks like a regression for the quantization of models with subgraphs.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, ONNX Runtime React Native CI Pipeline, Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 7 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
/azp run Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
In [`quantize_subgraph`](https://github.com/microsoft/onnxruntime/blob/v1.16.0/onnxruntime/python/tools/quantization/onnx_quantizer.py#L188-L189) `self.weight_qType` and `self.activation_qType` are [integers](https://github.com/microsoft/onnxruntime/blob/v1.16.0/onnxruntime/python/tools/quantization/onnx_quantizer.py#L115-L116) while `ONNXQuantizer` expects `QuantType`
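
To make the type mismatch concrete, here is a minimal sketch of the kind of conversion it implies; the `quant_type_from_tensor_type` helper and its mapping table are hypothetical, introduced only for illustration, and are not part of the onnxruntime quantization API.

```python
from onnx import TensorProto
from onnxruntime.quantization.quant_utils import QuantType

# Hypothetical reverse mapping (an assumption for illustration, not an
# onnxruntime API): ONNXQuantizer.__init__ stores the quantization types as
# plain ONNX tensor element type ints (e.g. TensorProto.INT8), while the
# ONNXQuantizer constructor invoked from quantize_subgraph expects QuantType
# enum values, so the stored int would need to be mapped back before
# recursing into a subgraph.
_TENSOR_TYPE_TO_QUANT_TYPE = {
    TensorProto.INT8: QuantType.QInt8,
    TensorProto.UINT8: QuantType.QUInt8,
}


def quant_type_from_tensor_type(tensor_type: int) -> QuantType:
    """Map an ONNX TensorProto element type back to the QuantType enum."""
    try:
        return _TENSOR_TYPE_TO_QUANT_TYPE[tensor_type]
    except KeyError as exc:
        raise ValueError(f"Unsupported quantization tensor type: {tensor_type}") from exc
```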