
quantize_static not quantizing Conv bias to int8 but int32 #17738

Closed
ChickenTarm opened this issue Sep 29, 2023 · 2 comments
Labels
quantization (issues related to quantization)

Comments


ChickenTarm commented Sep 29, 2023

Describe the issue

I quantized my model with these settings:

"quant_format": QuantFormat.QDQ,
"calibrate_method": CalibrationMethod.MinMax,
"per_channel": True,
"activation_type": QuantType.QInt8,
"weight_type": QuantType.QInt8,
"nodes_to_exclude": nodes_to_exclude[config.MODEL.node.node_function],
"extra_options": {
    "ActivationSymmetric": True,
    "WeightSymmetric": True,
},

However, for every single convolution the weights are quantized correctly to int8, but the bias is quantized to int32 instead of int8, as seen in the screenshots. What would be causing this? I suspect it is why my conversion to a TensorRT engine fails: the conversion fails when it reaches the Conv bias node.

(Two screenshots, taken 2023-09-28, showing a Conv node with int8 weights and an int32 bias.)

To reproduce

I run quantize_static on a yolor model.

Urgency

No response

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16

ONNX Runtime API

Python

Architecture

X86

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the ep:TensorRT (issues related to TensorRT execution provider) label on Sep 29, 2023
HectorSVC added the quantization (issues related to quantization) label and removed the ep:TensorRT label on Sep 29, 2023
yufenglee (Member) commented:

This is by design, for better performance. For the TRT engine, are you talking about TRT itself or the TRT EP? If the TRT EP, please take a look at this example: https://github.com/microsoft/onnxruntime-inference-examples/tree/980def0d59147b6117523cc756c89a10079ff8e4/quantization/object_detection/trt/yolov3. If TRT itself, you may need to use their toolchain.
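To make the design rationale concrete: in an int8 Conv, the int8×int8 products are accumulated in int32, and the bias must live on the same quantization grid as that accumulator, i.e. bias_scale = input_scale × weight_scale with zero point 0. That scale is tiny, so typical float biases map to integers far outside int8's [-128, 127] range. A small numpy sketch (the scale and bias values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-tensor scales (illustrative values, not from a real model).
input_scale = 0.02
weight_scale = 0.005
# The bias scale is fixed by the Conv math: it must match the int32 accumulator.
bias_scale = input_scale * weight_scale  # 0.0001

bias_fp32 = np.array([0.37, -1.2, 0.004], dtype=np.float32)
bias_q = np.round(bias_fp32 / bias_scale).astype(np.int64)
# e.g. 0.37 / 0.0001 = 3700, far outside int8's [-128, 127] range,
# which is why the bias is stored as int32 rather than int8.
bias_int32 = bias_q.astype(np.int32)

# Dequantizing recovers the original values to within half a step.
roundtrip = bias_int32 * bias_scale
```

Storing the bias as int8 on this grid would clip almost any realistic bias to ±127 steps, destroying accuracy; int32 keeps the full precision of the accumulator.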

ShuaiShao93 commented:

I see a similar issue; did you fix this, @ChickenTarm?
