
[Bug] fp8 quantization, weight_error is empty, but qdq_err is normal #21113

Open

zccyman opened this issue Jun 20, 2024 · 1 comment
Labels
quantization — issues related to quantization
stale — issues that have not been addressed in a while; categorized by a bot

Comments


zccyman commented Jun 20, 2024

Describe the bug: I quantized a model to FP8 with the options below (the surrounding quantize_static call is reconstructed here for context; the model paths and calibration data reader are placeholders):

        # from onnxruntime.quantization import (
        #     CalibrationMethod, QuantFormat, QuantType, quantize_static)
        quantize_static(
            "model_fp32.onnx",        # placeholder input path
            "model_fp8_qdq.onnx",     # placeholder output path
            calibration_data_reader,  # placeholder data reader
            quant_format=QuantFormat.QDQ,
            weight_type=QuantType.QFLOAT8E4M3FN,
            activation_type=QuantType.QFLOAT8E4M3FN,
            calibrate_method=CalibrationMethod.Distribution,
            extra_options=dict(
                QDQKeepRemovableActivations=True,
                ActivationSymmetric=False,
                WeightSymmetric=False,
                AddQDQPairToWeight=True,
                QuantizeBias=True,
                ForceQuantizeNoInputCheck=True,
                TensorQuantOverrides=dict(),
            ),
        )

weight_error is empty, but qdq_err is normal.
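For context on what a QDQ error even measures here: each Q/DQ pair snaps a float value to the nearest FP8 E4M3FN-representable value, and qdq_err reflects the resulting rounding error. The sketch below is not ONNX Runtime's implementation — it is a minimal pure-Python model of E4M3FN rounding (scale = 1, saturating at the max finite value 448) to illustrate the round-trip:

```python
# Minimal sketch (not ONNX Runtime's implementation): round a float to the
# nearest FP8 E4M3FN-representable value, as a QDQ (quantize/dequantize)
# pair with scale 1 would.

def _e4m3fn_values():
    """All non-negative finite values representable in FP8 E4M3FN."""
    vals = [0.0]
    # Subnormals: exponent field 0, value = (m/8) * 2**-6
    for m in range(1, 8):
        vals.append((m / 8.0) * 2.0 ** -6)
    # Normals: exponent field 1..15, value = (1 + m/8) * 2**(e - 7).
    # E4M3FN has no infinities; exponent=15, mantissa=7 encodes NaN,
    # so the largest finite value is (1 + 6/8) * 2**8 = 448.
    for e in range(1, 16):
        for m in range(8):
            if e == 15 and m == 7:
                continue  # NaN encoding, not a finite value
            vals.append((1 + m / 8.0) * 2.0 ** (e - 7))
    return vals

_E4M3 = sorted(_e4m3fn_values())

def qdq_e4m3(x):
    """Quantize-dequantize: snap x to the nearest E4M3FN value (scale = 1)."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)  # saturate at the max finite value
    return sign * min(_E4M3, key=lambda v: abs(v - mag))

# Representable values round-trip exactly; others pick up rounding error.
print(qdq_e4m3(1.0))    # 1.0
print(qdq_e4m3(448.0))  # 448.0
print(qdq_e4m3(0.3))    # 0.3125
```

A nonzero qdq_err on activations is therefore expected; the surprising part of this report is only that the weight comparison produced no entries at all.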

Comparing weights of float model vs qdq model.....
weight errors: 

------------------------------------------------

Augmenting models to save intermediate activations......
------------------------------------------------

Running the augmented floating point model to collect activations......
------------------------------------------------

Running the augmented qdq model to collect activations......
2024-06-20 06:53:28.508970448 [W:onnxruntime:, graph.cc:4093 CleanUnusedInitializersAndNodeArgs] Removing initializer 'conv1.bias_quantized_zero_point'. It is not used by any node and should be removed from the model.
2024-06-20 06:53:28.508988421 [W:onnxruntime:, graph.cc:4093 CleanUnusedInitializersAndNodeArgs] Removing initializer '/conv1/Conv_output_0_scale'. It is not used by any node and should be removed from the model.
2024-06-20 06:53:28.508992868 [W:onnxruntime:, graph.cc:4093 CleanUnusedInitializersAndNodeArgs] Removing initializer '/conv1/Conv_output_0_zero_point'. It is not used by any node and should be removed from the model.
2024-06-20 06:53:28.508997639 [W:onnxruntime:, graph.cc:4093 CleanUnusedInitializersAndNodeArgs] Removing initializer 'conv1.bias_quantized_scale'. It is not used by any node and should be removed from the model.
------------------------------------------------

Comparing activations of float model vs qdq model......
qdq_err: 

output 0.0839491831621686
input 0.036262167281220226
xmodel_err: 

output 0.15343184633604956
input 0.036262167281220226
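The per-tensor numbers above compare float-model activations against qdq-model activations. ONNX Runtime's qdq_loss_debug tooling computes its own metric; the sketch below uses a simple relative-L2 error on plain Python lists purely to illustrate the shape of such a comparison (names and data are hypothetical):

```python
# Illustration only: one simple per-tensor error metric for comparing float
# vs. QDQ activations. This is not ONNX Runtime's metric, just a relative-L2
# sketch of the kind of comparison the log above reports.
import math

def relative_l2_error(float_act, qdq_act):
    """||float - qdq||_2 / ||float||_2 over one flattened activation tensor."""
    num = math.sqrt(sum((f - q) ** 2 for f, q in zip(float_act, qdq_act)))
    den = math.sqrt(sum(f ** 2 for f in float_act))
    return num / den if den else 0.0

float_out = [0.50, -1.25, 2.00, 0.10]   # hypothetical float activations
qdq_out   = [0.50, -1.25, 2.00, 0.10]   # identical -> zero error
print(relative_l2_error(float_out, qdq_out))  # 0.0
```

With real tensors the qdq output differs slightly from the float output, giving small nonzero values like those in the log.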
@github-actions github-actions bot added the quantization issues related to quantization label Jun 20, 2024
github-actions bot (Contributor) commented:
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

@github-actions github-actions bot added the stale issues that have not been addressed in a while; categorized by a bot label Jul 20, 2024