
Is there any way to retrieve Quantization type and Quantization parameters using onnxruntime? #19916

Open
OAHLSTM opened this issue Mar 14, 2024 · 4 comments
Labels
quantization (issues related to quantization), stale (issues that have not been addressed in a while; categorized by a bot)

Comments


OAHLSTM commented Mar 14, 2024

Describe the issue

Hello,
I'm trying to get quantization parameters from an input tensor, such as the quantization type (static linear per-tensor / static linear per-channel / dynamic) and the associated quantization parameters (scales & zero_points).
In tensorflow-lite, we are able to check whether the model is quantized statically per-tensor or per-channel by simply doing:

const TfLiteQuantizationType tflite_qtype = tensor->quantization.type;
switch (tflite_qtype) {
    case TfLiteQuantizationType::kTfLiteAffineQuantization: {
        const auto* quant_params = reinterpret_cast<const TfLiteAffineQuantization*>(tensor->quantization.params);
        if (quant_params->scale && quant_params->scale->size > 1) {
            // per-channel quantization along the specified dimension
            int32_t quant_dim = quant_params->quantized_dimension;
            float* scales = quant_params->scale->data;
            int32_t* zero_points = quant_params->zero_point->data;
        } else {
            // per-tensor quantization: a single scale/zero_point pair
            float scale = tensor->params.scale;
            int32_t zero_point = tensor->params.zero_point;
        }
        break;
    }
    case TfLiteQuantizationType::kTfLiteNoQuantization:
    default:
        std::cout << "stai_map_qtype: float or non supported quant type" << std::endl;
}

I was wondering whether there is any way to retrieve quantization parameters similarly using onnxruntime.
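
For reference, here is a sketch of the kind of lookup I mean on the ONNX side, done offline against the model protobuf rather than through onnxruntime. It assumes a QDQ-format model whose scales and zero points are stored as initializers feeding DequantizeLinear nodes, and it uses only the generated onnx.pb.h accessors:

#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include "onnx/onnx_pb.h"  // generated protobuf classes for the ONNX model format

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    onnx::ModelProto model;
    std::ifstream in(argv[1], std::ios::binary);
    if (!model.ParseFromIstream(&in)) return 1;

    // Index initializers by name so scale/zero_point inputs can be resolved.
    std::unordered_map<std::string, const onnx::TensorProto*> initializers;
    for (const auto& t : model.graph().initializer()) initializers[t.name()] = &t;

    for (const auto& node : model.graph().node()) {
        if (node.op_type() != "DequantizeLinear") continue;
        auto it = initializers.find(node.input(1));  // input 1 is x_scale
        if (it == initializers.end()) continue;      // scale is not a constant initializer
        const onnx::TensorProto& scale = *it->second;
        int64_t num_scales = 1;
        for (int64_t d : scale.dims()) num_scales *= d;
        if (num_scales > 1) {
            // A 1-D scale tensor means per-channel ("per-axis") quantization;
            // the channel dimension is the node's optional "axis" attribute (default 1).
            int64_t axis = 1;
            for (const auto& attr : node.attribute())
                if (attr.name() == "axis") axis = attr.i();
            std::cout << node.input(0) << ": per-channel, axis=" << axis
                      << ", " << num_scales << " scales\n";
        } else {
            std::cout << node.input(0) << ": per-tensor\n";
        }
    }
    return 0;
}

This works as an offline pass over the model file, but it does not help once tensors are flowing through an InferenceSession, which is what I am after.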
Thank you for your help.

To reproduce

Not applicable

Urgency

This is urgent: we are migrating from tensorflow-lite to onnxruntime, and this feature is crucial for our implementation.

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions bot added the quantization (issues related to quantization) label on Mar 14, 2024
hariharans29 (Member) commented

AFAIK our Tensor interface provides no way to query such metadata. As for whether it can be ascertained at the model level, I'm tagging @yufenglee, as I am not sure about that.
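
For example (a minimal illustration using the public C++ API), an Ort::Value exposes the element type and shape of a tensor, but nothing quantization-related:

#include <onnxruntime_cxx_api.h>
#include <vector>

// From an Ort::Value you can learn that a tensor is, say, int8,
// but there is no accessor for its scale or zero_point.
void InspectTensor(const Ort::Value& value) {
    auto info = value.GetTensorTypeAndShapeInfo();
    ONNXTensorElementDataType type = info.GetElementType();  // e.g. ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8
    std::vector<int64_t> shape = info.GetShape();
}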


OAHLSTM commented Mar 21, 2024

Hello @yufenglee,
Any update on this topic?

Thank you for your support,

github-actions bot commented

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale (issues that have not been addressed in a while; categorized by a bot) label on Apr 22, 2024

OAHLSTM commented Oct 28, 2024

Hello guys,
Is there any update on this topic? I'm really looking to get the quantization parameters for my input and output tensors. I noticed in release 1.19.2 that the VSINPU and XNNPACK execution providers support retrieving quantization parameters through their functions:

void GetQuantizationScaleAndZeroPoint(
    const GraphViewer& graph_viewer, const NodeUnitIODef& io_def, const std::filesystem::path& model_path,
    float& scale, int32_t& zero_point, std::optional<std::vector<float>>& pcq_scales,
    std::optional<std::vector<int32_t>>& pcq_zps) 

and

std::pair<const onnx::TensorProto*, const onnx::TensorProto*>
GetQuantizationZeroPointAndScale(const GraphViewer& graphview,
                                 const NodeUnitIODef& io_def) 
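
For illustration, this is the kind of call I would like to be able to make from application code. It is a hypothetical usage of the internal helper quoted above: graph_viewer, io_def and model_path would have to come from inside an execution provider, since GraphViewer and NodeUnitIODef are not reachable from the public API today.

// Hypothetical usage of the internal VSINPU/XNNPACK helper quoted above.
float scale = 0.0f;
int32_t zero_point = 0;
std::optional<std::vector<float>> pcq_scales;    // set only for per-channel quantization
std::optional<std::vector<int32_t>> pcq_zps;
GetQuantizationScaleAndZeroPoint(graph_viewer, io_def, model_path,
                                 scale, zero_point, pcq_scales, pcq_zps);
if (pcq_scales.has_value()) {
    // per-channel: one scale/zero_point per channel along the quantized axis
} else {
    // per-tensor: the single scale/zero_point pair
}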

Is there any way to access the parameters loaded by these functions from the Tensor interface?
Thank you for your help.
