The current proposal has support for quantized types like `tensor-quant8-asymm`, and some operators support them. However, many networks run in mixed precision, e.g. a quantized-output matrix multiply followed by a logsoftmax in float32. Propose adding https://github.com/onnx/onnx/blob/master/docs/Operators.md#DequantizeLinear and https://github.com/onnx/onnx/blob/master/docs/Operators.md#QuantizeLinear to make the quantized operators actually usable for many models.
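For context, here is a minimal NumPy sketch of what these two ONNX operators compute in the per-tensor case; the values and variable names are illustrative only and not part of the proposal's API:

```python
import numpy as np

def quantize_linear(x, scale, zero_point):
    # y = saturate(round(x / scale) + zero_point); uint8 output assumed here.
    q = np.rint(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize_linear(q, scale, zero_point):
    # x = (q - zero_point) * scale, producing float32.
    return ((q.astype(np.int32) - zero_point) * scale).astype(np.float32)

# Mixed-precision pattern from above: dequantize the quantized matmul output
# so a float32 logsoftmax can consume it.
q_out = np.array([[130, 140, 125]], dtype=np.uint8)  # hypothetical quantized matmul result
x = dequantize_linear(q_out, scale=np.float32(0.05), zero_point=128)
log_probs = x - np.log(np.sum(np.exp(x), axis=-1, keepdims=True))  # logsoftmax in float32
```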
(4 years later 😲) quantizeLinear and dequantizeLinear are proposed here: #375 (comment). There's still some thought needed on block size, given DequantizeLinear-21's new attribute (whether to do something similar, different, more generic, or more limited...), but it has momentum.