From cadd6b236e093f910c9e7cea623c81846cab3506 Mon Sep 17 00:00:00 2001
From: Yaman Umuroglu <yaman.umuroglu@amd.com>
Date: Mon, 23 Oct 2023 23:16:31 +0200
Subject: [PATCH] [Spec] clarifications to Quant op spec

* scale and zeropt can each be either a scalar or a tensor with the same number of dimensions as the input, e.g. for channel-wise quantization.
* bitwidth may be specified as float32 for convenience, but must still represent a positive integer.
---
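Note for reviewers: the clarified input rules can be illustrated with a
minimal NumPy sketch. It is not the reference implementation. The function
name `quant_sketch`, the use of `np.round` (round-half-to-even), the integer
range formulas for `signed`/`narrow`, and the restriction to a scalar
`bitwidth` are all assumptions made for illustration only.

```python
import numpy as np


def quant_sketch(x, scale, zeropt, bitwidth, signed=True, narrow=False):
    """Hypothetical illustration of the clarified Quant input rules."""
    x = np.asarray(x, dtype=np.float32)
    scale = np.asarray(scale, dtype=np.float32)
    zeropt = np.asarray(zeropt, dtype=np.float32)

    # scale and zeropt: either a scalar (0-d) or the same number of
    # dimensions as X, so that they broadcast tensor-wise or channel-wise.
    for name, param in (("scale", scale), ("zeropt", zeropt)):
        if param.ndim not in (0, x.ndim):
            raise ValueError(f"{name} must be a scalar or have {x.ndim} dims")

    # bitwidth may arrive as int32 or float32, but it must still
    # represent a positive integer (assumed scalar in this sketch).
    bw = float(bitwidth)
    if bw <= 0 or not bw.is_integer():
        raise ValueError("bitwidth must represent a positive integer")
    bits = int(bw)

    # Assumed integer ranges for the signed/narrow attributes.
    if signed:
        lo = -(2 ** (bits - 1)) + (1 if narrow else 0)
        hi = 2 ** (bits - 1) - 1
    else:
        lo = 0
        hi = 2 ** bits - 1 - (1 if narrow else 0)

    # Quantize to the integer grid, clip, then map back to float.
    q = np.clip(np.round(x / scale + zeropt), lo, hi)
    return ((q - zeropt) * scale).astype(np.float32)


# Channel-wise example: one scale per row of a 2-D input, bitwidth as float32.
x = np.array([[0.26, -1.0], [3.0, 0.1]], dtype=np.float32)
per_channel_scale = np.array([[0.5], [0.25]], dtype=np.float32)
out = quant_sketch(x, per_channel_scale, 0.0, np.float32(4.0))
```

The per-channel scale has shape (2, 1), i.e. the same number of dimensions as
`x`, so ordinary broadcasting applies one scale to each row.
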
 docs/qonnx-custom-ops/quant_op.md | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/docs/qonnx-custom-ops/quant_op.md b/docs/qonnx-custom-ops/quant_op.md
index 003be341..02d115fb 100644
--- a/docs/qonnx-custom-ops/quant_op.md
+++ b/docs/qonnx-custom-ops/quant_op.md
@@ -1,7 +1,9 @@
 ### <a name="Quant"></a><a name="abs">**Quant**</a>
 
 Calculates the quantized values of one input data (Tensor<T>) and produces one output data (Tensor<T>).
-Additionally, takes three floats as input, which define the scale, zero-point and bit-width of the quantization.
+Additionally, takes three floats as input, which define the scale, zero-point and bit-width of the quantization.
+These may be scalars or tensors with the same number of dimensions as the input data tensor, e.g. for
+tensor-wise or channel-wise quantization.
 The attributes narrow and signed define how the bits of the quantization are interpreted, while the attribute
 rounding_mode defines how quantized values are rounded.
 
@@ -27,12 +29,12 @@ This operator is not part of the ONNX standard and is not currently versioned.
 <dl>
 <dt><tt>X</tt> (differentiable) : tensor(float32)</dt>
 <dd>input tensor to quantize</dd>
-<dt><tt>scale</tt> : float32</dt>
-<dd>The scale factor</dd>
-<dt><tt>zeropt</tt> : float32</dt>
-<dd>The zero-point</dd>
-<dt><tt>bitwidth</tt> : int32</dt>
-<dd>The number of bits used by the quantization</dd>
+<dt><tt>scale</tt> : float32, tensor(float32)</dt>
+<dd>The scale factor, either as a global scalar or as a tensor with the same number of dimensions as the X tensor</dd>
+<dt><tt>zeropt</tt> : float32, tensor(float32)</dt>
+<dd>The zero-point, either as a global scalar or as a tensor with the same number of dimensions as the X tensor</dd>
+<dt><tt>bitwidth</tt> : int32, float32</dt>
+<dd>The number of bits used by the quantization, which must be a positive integer. If the float32 dtype is used for convenience, the value must still represent a positive integer number of bits.</dd>
 </dl>