Hello,

In other quantization frameworks, e.g. TFLite (https://arxiv.org/pdf/1712.05877.pdf), Glow (https://github.com/pytorch/glow/blob/a07fbed5da37e6f9bf8b459d15b445067efa3173/docs/Quantization.md?plain=1#L184C28-L184C28), and also PyTorch (https://discuss.pytorch.org/t/is-bias-quantized-while-doing-pytorch-static-quantization/146416/5), biases are typically either left in FLOAT32 or quantized to a higher-precision integer type such as int32. Does NNCF support something similar?

Thank you
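For context, a minimal sketch of the int32 bias-quantization scheme described in the TFLite paper linked above: the bias scale is fixed to the product of the input activation scale and the (possibly per-channel) weight scale, with a zero-point of 0, so the quantized bias can be added directly to the int32 accumulator. The function name and arguments here are hypothetical, not part of any framework's API.

```python
import numpy as np

def quantize_bias_int32(bias_fp32, input_scale, weight_scales):
    """Quantize an FP32 bias vector to int32, TFLite-style.

    bias_scale = input_scale * weight_scale (per output channel),
    zero-point is fixed at 0, so q_bias ~= bias / bias_scale.
    """
    bias_scale = input_scale * np.asarray(weight_scales)
    q = np.round(np.asarray(bias_fp32) / bias_scale)
    # Clamp to the int32 range before casting.
    info = np.iinfo(np.int32)
    return np.clip(q, info.min, info.max).astype(np.int32)
```

For example, with `input_scale=0.1` and per-channel weight scales `[0.05, 0.02]`, a bias of `[0.5, -1.0]` quantizes to `[100, -500]`.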
Answered by alexsu52, Sep 4, 2023
Hello @i3abghany, NNCF leaves biases in FP32.
Hello @i3abghany,
I think it is possible, but I would like to understand the user scenario and the benefits of this option in NNCF. Could you share your use case? Perhaps I can make a suggestion.