
Naming conflict when quantizing models with Pad nodes #17760

Closed
hokchhaytann opened this issue Oct 2, 2023 · 3 comments · Fixed by #17807
Assignees: wschin
Labels: quantization issues related to quantization

Comments


hokchhaytann commented Oct 2, 2023

Describe the issue

After quantization, the constant input of every Pad node is given the default name '_quantized'. Because of this naming conflict, the resulting quantized model has floating QuantizeLinear nodes that are not connected to anything, all of the Pad ops share a single QuantizeLinear input, and there is a DequantizeLinear node whose output leads nowhere.

(Two screenshots of the quantized graph, showing the disconnected QuantizeLinear/DequantizeLinear nodes, omitted.)
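As a rough sketch of the suspected mechanism (the '_quantized' suffixing shown here is an assumption about the quantizer's naming behavior, not its actual code):

# Hypothetical sketch: if quantized tensor names are derived by suffixing
# the input name, an empty Pad input name collides for every Pad node.
pad_constant_inputs = ["", "", ""]  # each Pad's 3rd input name is empty
quantized_names = [name + "_quantized" for name in pad_constant_inputs]
assert len(set(quantized_names)) == 1  # every Pad collides on "_quantized"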

To reproduce

Here's the model.
And the script is below:

import subprocess
from os import path
import onnxruntime as ort
import numpy as np

from onnxruntime.quantization import (
    QuantType,
    QuantFormat,
    quantize_static,
    CalibrationMethod,
    CalibrationDataReader
)


class RandomCalibrationDataReader(CalibrationDataReader):
    # Feeds `num_data` batches of random inputs for calibration during static quantization.
    def __init__(self, input_name: str, input_shape, num_data=32):
        self.enum_data = None
        self.input_shape = input_shape
        self.input_name = input_name
        self.num_data = num_data

    def get_next(self):
        if self.enum_data is None:
            self.enum_data = iter(
                [{self.input_name: np.random.rand(*self.input_shape).astype(np.float32)} for _ in range(self.num_data)]
            )
        return next(self.enum_data, None)

    def rewind(self):
        self.enum_data = None


input_model_path = './swin_b.onnx'
optimized_model_path = input_model_path.replace(".onnx", "-infer.onnx")
output_model_path = input_model_path.replace(".onnx", "-QOperator.onnx")
quant_format = QuantFormat.QOperator

ort_session = ort.InferenceSession(input_model_path, providers=["CPUExecutionProvider"])
input_name = ort_session.get_inputs()[0].name
input_shape = ort_session.get_inputs()[0].shape

if not path.exists(optimized_model_path):
    # optimize the model
    subprocess.run(["python", "-m", "onnxruntime.quantization.preprocess",
                    "--input", input_model_path,
                    "--output", optimized_model_path,
                    "--auto_merge"])

# generate random data as calibration data
calib_data_reader = RandomCalibrationDataReader(input_name, input_shape)
# quantize_static writes the quantized model to output_model_path
quantize_static(model_input=optimized_model_path,
                model_output=output_model_path,
                calibration_data_reader=calib_data_reader,
                quant_format=quant_format,
                per_channel=True,
                activation_type=QuantType.QUInt8,
                weight_type=QuantType.QInt8,
                calibrate_method=CalibrationMethod.MinMax)
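
The collision can then be confirmed by inspecting the quantized graph, as in this sketch (it reuses output_model_path from the script above):

import onnx

# Look for nodes that write to the bare "_quantized" tensor name.
quantized = onnx.load(output_model_path)
colliding = [node.name for node in quantized.graph.node
             if "_quantized" in node.output]
print("nodes writing to the bare '_quantized' tensor:", colliding)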

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04.6 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the quantization (issues related to quantization) label Oct 2, 2023
wschin (Contributor) commented Oct 2, 2023

Thanks for reporting this problem. A minimal repro is required to debug it. Could you please make the script runnable on our side with a simple copy-and-paste? The current repro is missing several definitions, such as CalibrationDataReader and np, when I just run it.

hokchhaytann (Author) commented Oct 2, 2023

Thanks for your response. I've edited the script above. You just need to change the input path, and it should be runnable.

wschin (Contributor) commented Oct 5, 2023

I see a bunch of Pad nodes in swin_b.onnx with an empty string as one of their input names; e.g., the 3rd input of this Pad:

input: "/features/features.7/features.7.1/norm1/LayerNormalization_output_0"
input: "/features/features.7/features.7.1/attn/Cast_1_output_0"
input: ""
output: "/features/features.7/features.7.1/attn/Pad_output_0"
name: "/features/features.7/features.7.1/attn/Pad"
op_type: "Pad"
attribute {
  name: "mode"
  type: STRING
  s: "constant"
}
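
A short sketch to locate such nodes (the model path here, ./swin_b-infer.onnx, is an assumption based on the repro script):

import onnx

# Print every Pad node that carries an empty-string input name.
model = onnx.load("./swin_b-infer.onnx")
for node in model.graph.node:
    if node.op_type == "Pad" and "" in node.input:
        print(node.name, list(node.input))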

By removing those empty strings (e.g., del onnx_node.input[-1]), I guess quantization will start working. To do so quickly, you can go to your onnxruntime installation path and find pad.py (e.g., /.../lib/python3.10/site-packages/onnxruntime/quantization/operators/pad.py). Go to around line 33 and change the condition:

        if "mode" not in kwargs or kwargs["mode"] == b"constant":
-            if len(node.input) > 2:  # There is 3rd input 'constant_value'
+            if len(node.input) > 2 and node.input[2] != "":  # There is 3rd input 'constant_value'
                zp_tensor = self.quantizer.model.get_initializer(quantized_input_value.zp_name)
                scale_tensor = self.quantizer.model.get_initializer(quantized_input_value.scale_name)
                if zp_tensor is None or scale_tensor is None:
                    super().quantize()
                    return
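
Alternatively, a sketch of the model-side workaround (dropping the empty optional input before quantizing); file names are assumed from the repro script:

import onnx

model = onnx.load("./swin_b-infer.onnx")  # path assumed from the repro script
for node in model.graph.node:
    if node.op_type == "Pad" and len(node.input) > 2 and node.input[2] == "":
        del node.input[2]  # drop the empty optional 'constant_value' input
onnx.save(model, "./swin_b-infer-clean.onnx")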

wschin self-assigned this Oct 5, 2023
wschin added a commit that referenced this issue Oct 9, 2023
Fix #17760. The upstream exporter creates an empty string as Pad's 3rd input,
and the quantization tool 1) considers that a valid tensor name and
2) adds corresponding invalid quantization nodes. This PR adds a
condition check to make the quantization tool work.
kleiti pushed a commit to kleiti/onnxruntime that referenced this issue Mar 22, 2024