Fix QLinearConv promotion

Currently, QDQ to QLinearConv promotion happens incorrectly for per-axis quantization where axis != 0. QLinearConv supports only per-output-channel quantization, which is equivalent to per-axis quantization with axis=1. The issue is easy to see when running this model with optimization enabled (promoted to QLinearConv) and disabled (using QDQ): the two runs produce different results.
microsoft · Aug 12, 2024 · 88a985b
1 parent c5592fd
Showing 1 changed file with 7 additions and 0 deletions.
diff --git a/onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selectors.cc b/onnxruntime/core/optimizer/qdq_transformer/selectors_actions/qdq_selectors.cc
@@ -370,6 +370,13 @@ bool ConvNodeGroupSelector::Check(const GraphViewer& graph_viewer,
     return false;
   }
 
+  // Only per-tensor or per-output channel (axis == 1) quantization is supported
+  const auto& dq_attrs = dq_nodes[1]->GetAttributes();
+  if (const auto a_iter = dq_attrs.find("axis");
+      a_iter == dq_attrs.end() || a_iter->second.i() != 1) {
+    return false;
+  }
+
   return true;
 }
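
The added condition can be exercised in isolation with a minimal sketch like the one below. It is not the onnxruntime code: `AttrMap` and `WeightAxisAllowsPromotion` are hypothetical names, and a plain map from attribute name to integer stands in for the node attributes returned by `GetAttributes()`. Only the shape of the predicate is taken from the diff.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a DQ node's attributes (attribute name -> integer value).
// The real code reads attribute protos and calls .i() to get the integer.
using AttrMap = std::unordered_map<std::string, std::int64_t>;

// Mirrors the predicate added in the diff: promotion is allowed only when
// an explicit "axis" attribute is present and equal to 1.
bool WeightAxisAllowsPromotion(const AttrMap& dq_attrs) {
  if (const auto a_iter = dq_attrs.find("axis");
      a_iter == dq_attrs.end() || a_iter->second != 1) {
    return false;  // attribute missing, or axis is something other than 1
  }
  return true;
}
```

Under these assumptions, `WeightAxisAllowsPromotion(AttrMap{{"axis", 1}})` is accepted, while `AttrMap{{"axis", 0}}` and an empty `AttrMap{}` are both rejected, so a node with no explicit `axis` attribute is not promoted by this predicate.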