
quantize_static not quantizing Conv bias to int8 but int32 #17738

Closed
ChickenTarm opened this issue Sep 29, 2023 · 2 comments
Labels
quantization (issues related to quantization)

Comments


ChickenTarm commented Sep 29, 2023

Describe the issue

I quantized my model with these settings:

"quant_format": QuantFormat.QDQ,
"calibrate_method": CalibrationMethod.MinMax,
"per_channel": True,
"activation_type": QuantType.QInt8,
"weight_type": QuantType.QInt8,
"nodes_to_exclude": nodes_to_exclude[config.MODEL.node.node_function],
"extra_options": {
    "ActivationSymmetric": True,
    "WeightSymmetric": True,
},

However, for every single convolution the weights are quantized correctly to int8, but the bias is quantized to int32 instead of int8, as seen in the screenshots. What would be causing this? I suspect it is why my conversion to a TensorRT engine fails: the conversion fails when it reaches the Conv bias node.

(Two screenshots, taken 2023-09-28, showing a Conv node with int8 weights and an int32 bias.)

To reproduce

I run quantize_static on a yolor model.

Urgency

No response

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16

ONNX Runtime API

Python

Architecture

X86

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the ep:TensorRT (issues related to TensorRT execution provider) label on Sep 29, 2023
HectorSVC added the quantization (issues related to quantization) label and removed the ep:TensorRT label on Sep 29, 2023
yufenglee (Member) commented:

This is by design, for better performance. For the TRT engine, are you talking about TRT itself or the TRT EP? If the TRT EP, please take a look at this example: https://github.com/microsoft/onnxruntime-inference-examples/tree/980def0d59147b6117523cc756c89a10079ff8e4/quantization/object_detection/trt/yolov3. If TRT itself, you may need to use their toolchain.
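To make the design rationale concrete: in an int8 Conv, the int8×int8 products are accumulated in int32, and the bias must live on the same quantization grid as that accumulator, i.e. bias_scale = input_scale × weight_scale with zero point 0. That scale is tiny, so typical float biases map to integers far outside int8's [-128, 127] range. A small numpy sketch (the scale and bias values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-tensor scales (illustrative values, not from a real model).
input_scale = 0.02
weight_scale = 0.005
# The bias scale is fixed by the Conv math: it must match the int32 accumulator.
bias_scale = input_scale * weight_scale  # 0.0001

bias_fp32 = np.array([0.37, -1.2, 0.004], dtype=np.float32)
bias_q = np.round(bias_fp32 / bias_scale).astype(np.int64)
# e.g. 0.37 / 0.0001 = 3700, far outside int8's [-128, 127] range,
# which is why the bias is stored as int32 rather than int8.
bias_int32 = bias_q.astype(np.int32)

# Dequantizing recovers the original values to within half a step.
roundtrip = bias_int32 * bias_scale
```

Storing the bias as int8 on this grid would clip almost any realistic bias to ±127 steps, destroying accuracy; int32 keeps the full precision of the accumulator.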

ShuaiShao93 commented:

I see a similar issue; did you fix this, @ChickenTarm?
