Skip to content

Commit

Permalink
Partially disable QGemm tests for float 8 types (#18196)
Browse files Browse the repository at this point in the history
### Description
The quantization tool assumes QGemm is implemented for float 8 types but
it is not yet supported. The condition partially disabling the test was
not robust enough. This is changed by this PR.
  • Loading branch information
xadupre authored Nov 3, 2023
1 parent 9f5a685 commit 1439da3
Showing 1 changed file with 3 additions and 18 deletions.
21 changes: 3 additions & 18 deletions onnxruntime/test/python/quantization/test_op_gemm.py
Original file line number Diff line number Diff line change
Expand Up @@ -192,24 +192,9 @@ def static_quant_test(
check_qtype_by_node_type(self, model_int8_path, qnode_io_qtypes)
data_reader.rewind()
if activation_type_str == "f8e4m3fn" and weight_type_str == "f8e4m3fn":
# QGemm is not implemented for CPU.
try:
check_model_correctness(
self,
model_fp32_path,
model_int8_path,
data_reader.get_next(),
providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
is_gemm=True,
)
except Exception as e:
if (
"Type 'tensor(float8e4m3fn)' of input parameter (input_quantized) of operator (QGemm) in node () is invalid."
in str(e)
):
warnings.warn("Fix this test when QGemm is implemented.")
return
raise e
# QGemm for float 8 is not implemented. The test should be updated when it is.
warnings.warn("Fix this test when QGemm is implemented for float 8 types.")
return
else:
check_model_correctness(self, model_fp32_path, model_int8_path, data_reader.get_next(), is_gemm=True)

Expand Down

0 comments on commit 1439da3

Please sign in to comment.