
Naming conflict when quantizing models with Pad nodes #17760

Closed
hokchhaytann opened this issue Oct 2, 2023 · 3 comments · Fixed by #17807
Assignees: wschin
Labels: quantization issues related to quantization

Comments


hokchhaytann commented Oct 2, 2023

Describe the issue

After quantization, the constant input of every Pad node is given the default name '_quantized'. Because of this naming conflict, the resulting quantized model has floating QuantizeLinear nodes that are not connected to anything, all of the Pad ops share a single QuantizeLinear input, and there is a DequantizeLinear node whose output leads nowhere.

(Two screenshots of the quantized graph, showing the disconnected QuantizeLinear/DequantizeLinear nodes, omitted.)
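As a rough sketch of the suspected mechanism (the '_quantized' suffixing shown here is an assumption about the quantizer's naming behavior, not its actual code):

# Hypothetical sketch: if quantized tensor names are derived by suffixing
# the input name, an empty Pad input name collides for every Pad node.
pad_constant_inputs = ["", "", ""]  # each Pad's 3rd input name is empty
quantized_names = [name + "_quantized" for name in pad_constant_inputs]
assert len(set(quantized_names)) == 1  # every Pad collides on "_quantized"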

To reproduce

Here's the model.
And the script is below:

import subprocess
from os import path
import onnxruntime as ort
import numpy as np

from onnxruntime.quantization import (
    QuantType,
    QuantFormat,
    quantize_static,
    CalibrationMethod,
    CalibrationDataReader
)


class RandomCalibrationDataReader(CalibrationDataReader):
    # Feeds `num_data` batches of random inputs for calibration during static quantization.
    def __init__(self, input_name: str, input_shape, num_data=32):
        self.enum_data = None
        self.input_shape = input_shape
        self.input_name = input_name
        self.num_data = num_data

    def get_next(self):
        if self.enum_data is None:
            self.enum_data = iter(
                [{self.input_name: np.random.rand(*self.input_shape).astype(np.float32)} for _ in range(self.num_data)]
            )
        return next(self.enum_data, None)

    def rewind(self):
        self.enum_data = None


input_model_path = './swin_b.onnx'
optimized_model_path = input_model_path.replace(".onnx", "-infer.onnx")
output_model_path = input_model_path.replace(".onnx", "-QOperator.onnx")
quant_format = QuantFormat.QOperator

ort_session = ort.InferenceSession(input_model_path, providers=["CPUExecutionProvider"])
input_name = ort_session.get_inputs()[0].name
input_shape = ort_session.get_inputs()[0].shape

if not path.exists(optimized_model_path):
    # optimize the model
    subprocess.run(["python", "-m", "onnxruntime.quantization.preprocess",
                    "--input", input_model_path,
                    "--output", optimized_model_path,
                    "--auto_merge"])

# generate random data as calibration data
calib_data_reader = RandomCalibrationDataReader(input_name, input_shape)
# quantize_static writes the quantized model to output_model_path
quantize_static(model_input=optimized_model_path,
                model_output=output_model_path,
                calibration_data_reader=calib_data_reader,
                quant_format=quant_format,
                per_channel=True,
                activation_type=QuantType.QUInt8,
                weight_type=QuantType.QInt8,
                calibrate_method=CalibrationMethod.MinMax)
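
The collision can then be confirmed by inspecting the quantized graph, as in this sketch (it reuses output_model_path from the script above):

import onnx

# Look for nodes that write to the bare "_quantized" tensor name.
quantized = onnx.load(output_model_path)
colliding = [node.name for node in quantized.graph.node
             if "_quantized" in node.output]
print("nodes writing to the bare '_quantized' tensor:", colliding)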

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04.6 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the quantization (issues related to quantization) label Oct 2, 2023
wschin (Contributor) commented Oct 2, 2023

Thanks for reporting this problem. A minimal repro is required to debug it. Could you please make the script runnable on our side with a simple copy-and-paste? The current repro is missing several definitions, such as CalibrationDataReader and np, when I just run it.

hokchhaytann (Author) commented Oct 2, 2023

Thanks for your response. I've edited the script above. You just need to change the input path, and it should be runnable.

wschin (Contributor) commented Oct 5, 2023

I see a bunch of Pad nodes in swin_b.onnx with an empty string as one of their input names; e.g., the 3rd input of this Pad:

input: "/features/features.7/features.7.1/norm1/LayerNormalization_output_0"
input: "/features/features.7/features.7.1/attn/Cast_1_output_0"
input: ""
output: "/features/features.7/features.7.1/attn/Pad_output_0"
name: "/features/features.7/features.7.1/attn/Pad"
op_type: "Pad"
attribute {
  name: "mode"
  type: STRING
  s: "constant"
}
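
A short sketch to locate such nodes (the model path here, ./swin_b-infer.onnx, is an assumption based on the repro script):

import onnx

# Print every Pad node that carries an empty-string input name.
model = onnx.load("./swin_b-infer.onnx")
for node in model.graph.node:
    if node.op_type == "Pad" and "" in node.input:
        print(node.name, list(node.input))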

By removing those empty strings (e.g., del onnx_node.input[-1]), I guess quantization will start working. To do so quickly, you can go to your onnxruntime installation path and find pad.py (e.g., /.../lib/python3.10/site-packages/onnxruntime/quantization/operators/pad.py). Go to around line 33 and change the condition:

        if "mode" not in kwargs or kwargs["mode"] == b"constant":
-            if len(node.input) > 2:  # There is 3rd input 'constant_value'
+            if len(node.input) > 2 and node.input[2] != "":  # There is 3rd input 'constant_value'
                zp_tensor = self.quantizer.model.get_initializer(quantized_input_value.zp_name)
                scale_tensor = self.quantizer.model.get_initializer(quantized_input_value.scale_name)
                if zp_tensor is None or scale_tensor is None:
                    super().quantize()
                    return
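
Alternatively, a sketch of the model-side workaround (dropping the empty optional input before quantizing); file names are assumed from the repro script:

import onnx

model = onnx.load("./swin_b-infer.onnx")  # path assumed from the repro script
for node in model.graph.node:
    if node.op_type == "Pad" and len(node.input) > 2 and node.input[2] == "":
        del node.input[2]  # drop the empty optional 'constant_value' input
onnx.save(model, "./swin_b-infer-clean.onnx")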

wschin self-assigned this Oct 5, 2023
wschin added a commit that referenced this issue Oct 9, 2023
Fix #17760. The upstream exporter creates an empty string as Pad's 3rd input,
and the quantization tool 1) considers that a valid tensor name and
2) adds corresponding invalid quantization nodes. This PR adds a
condition check to make the quantization tool work.
kleiti pushed a commit to kleiti/onnxruntime that referenced this issue Mar 22, 2024