
[DML] MatrixMultiplyIntegerToFloat #19608

Merged: 10 commits from user/anarao/DMLEPMatMulInt2Flt into main on Mar 4, 2024
Conversation

@raoanag (Contributor) commented Feb 22, 2024

Description

DML Implementation for [com.microsoft.MatMulIntegerToFloat](https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.MatMulIntegerToFloat)

```
.\onnxruntime_test_all.exe --gtest_filter="*MatMulIntegerToFloat.*"
Note: Google Test filter = *MatMulIntegerToFloat.*
[==========] Running 22 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 22 tests from MatMulIntegerToFloat
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8 (620 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8 (497 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8 (488 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8S8 (503 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8U8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8U8 (495 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8U8 (488 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8U8 (492 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8X8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8X8 (502 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8U8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8U8 (452 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8U8 (454 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8U8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8U8 (446 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8U8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_S8U8 (508 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8S8 (456 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8S8 (455 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8S8 (447 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_HasBias_test_U8S8 (465 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8U8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8U8 (111 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8S8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_U8S8 (115 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8S8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8S8 (114 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8U8
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16_S8U8 (110 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16
[       OK ] MatMulIntegerToFloat.MatMulIntegerToFloat_FP16 (112 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint
[       OK ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint (337 ms)
[----------] 22 tests from MatMulIntegerToFloat (8679 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 1 test suite ran. (8680 ms total)
[  PASSED  ] 22 tests.
memleakdbg:
----- No memory leaks detected -----
```

Motivation and Context

  • `CalculateMatMulIntegerToFloat` replaces the CPU EP run as the reference
  • Added more FP32 test cases to isolate all input datatype combinations
  • Added fixed inputs to the `MatMulIntegerToFloat_FP16*` test cases
  • `onnxruntime/test/testdata/matmul_integer_to_float.py` is capable of generating FP16 models, but we do not produce any for now
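
The operator's reference semantics (dequantize with zero points, matmul, rescale, add bias) can be sketched in NumPy. This is an illustrative sketch with scalar per-tensor scales and a hypothetical helper name, not the actual `CalculateMatMulIntegerToFloat` implementation:

```python
import numpy as np

def matmul_integer_to_float_ref(A, B, a_scale, b_scale, a_zp=0, b_zp=0, bias=None):
    """Sketch of MatMulIntegerToFloat: float((A - a_zp) @ (B - b_zp)) * (a_scale * b_scale) + bias."""
    # Accumulate in int32, as integer matmul implementations typically do
    acc = (A.astype(np.int32) - np.int32(a_zp)) @ (B.astype(np.int32) - np.int32(b_zp))
    out = acc.astype(np.float32) * np.float32(a_scale * b_scale)
    if bias is not None:
        out = out + np.asarray(bias, dtype=np.float32)
    return out

# Example with a U8 input, S8 weight, and zero points (values are arbitrary)
A = np.array([[1, 2], [3, 4]], dtype=np.uint8)
B = np.array([[5, 6], [7, 8]], dtype=np.int8)
print(matmul_integer_to_float_ref(A, B, 0.5, 0.25, a_zp=1, b_zp=2))
```

Computing the reference in float this way, rather than running the CPU EP, is what isolates the tests from the CPU EP inaccuracies described below.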

@raoanag raoanag force-pushed the user/anarao/DMLEPMatMulInt2Flt branch 5 times, most recently from ce6aa75 to 3cec30a Compare February 23, 2024 19:10
@raoanag raoanag marked this pull request as ready for review February 23, 2024 20:09
@raoanag raoanag requested a review from a team as a code owner February 23, 2024 20:09
@raoanag raoanag requested review from yufenglee and tbqh February 23, 2024 20:09
raoanag and others added 6 commits February 27, 2024 12:11
[Cherry Pick Reviewed]
Commit all MatrixMultiplyIntegerToFloat PRs:
- MatrixMultiplyIntegerToFloat (https://github.com/microsoft/onnxruntime/pull/16804)
- MatMulIntToFloat Enable FP16 and update tensor ORT-DML indexing (https://github.com/microsoft/onnxruntime/pull/16871)
- Disable MatMulIntegerToFloat transformation for FP16 on CPU EP (https://github.com/microsoft/onnxruntime/pull/18239)

### Description

MatMulIntegerToFloat tests were failing on the DML EP; the root cause was inaccuracies in the CPU EP implementation for some data type combinations.


### Motivation and Context
* `CalculateMatMulIntegerToFloat` replaces the CPU EP run as the reference
* Added more FP32 test cases to isolate all input datatype combinations
* Added fixed inputs to the `MatMulIntegerToFloat_FP16*` test cases. The gtest framework has no support for comparing the `onnxruntime::MLFloat16` datatype directly, which forces an FP32 reference -> FP16 tensor -> FP32 reference conversion that introduces inaccuracies.

![image](https://github.com/microsoft/onnxruntime/assets/127366241/c6aaf68e-44df-42be-9860-df2cb0dd7a56)
* Removed `MatMulIntegerToFloatHelper` as it is the same as `MatMulHelper`
* `onnxruntime/test/testdata/matmul_integer_to_float.py` is still capable of generating FP16 models, but we do not produce any for now
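
The FP32 -> FP16 -> FP32 round trip can be reproduced in a few lines of NumPy. This standalone sketch (arbitrary sample value, not ORT test code) shows that the conversion alone perturbs the reference before any comparison happens:

```python
import numpy as np

ref = np.float32(0.1234567)              # value produced by the FP32 reference path
roundtrip = np.float32(np.float16(ref))  # cast to an FP16 tensor value and back to FP32
err = abs(float(ref) - float(roundtrip))
print(err)  # nonzero: FP16 has ~3 decimal digits of precision, so the cast loses bits
```

Fixing the test inputs keeps this unavoidable round-trip error deterministic instead of letting random inputs occasionally push it past the tolerance.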
@raoanag raoanag force-pushed the user/anarao/DMLEPMatMulInt2Flt branch from 139236a to ddf8f78 Compare February 27, 2024 21:28
@raoanag raoanag force-pushed the user/anarao/DMLEPMatMulInt2Flt branch from ddf8f78 to 6fe223c Compare February 28, 2024 00:41
@tbqh previously approved these changes Feb 28, 2024
@yufenglee previously approved these changes Feb 29, 2024
@raoanag (Contributor Author) commented Mar 1, 2024

/azp run Big Models, Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline

Azure Pipelines successfully started running 3 pipeline(s).

@raoanag (Contributor Author) commented Mar 1, 2024

/azp run Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows x64 QNN CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

@raoanag (Contributor Author) commented Mar 1, 2024

/azp run Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline

Azure Pipelines successfully started running 9 pipeline(s).

@raoanag (Contributor Author) commented Mar 1, 2024

/azp run Windows x64 QNN CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

Azure Pipelines successfully started running 5 pipeline(s).

@raoanag raoanag dismissed stale reviews from yufenglee and tbqh via a4c8158 March 1, 2024 20:17
@raoanag raoanag merged commit 27b1dc9 into main Mar 4, 2024
93 of 95 checks passed
@raoanag raoanag deleted the user/anarao/DMLEPMatMulInt2Flt branch March 4, 2024 19:55
@tianleiwu (Contributor) commented Mar 5, 2024

@raoanag, please adjust the test threshold (always use abs error and not relative error?). Here is an example test failure:
```
NoZeroPoint_HasBias_test_U8S8
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272
The difference between cur_expected[i] and cur_actual[i] is 6.6161155700683594e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 0.00020684301853179932,
cur_actual[i] evaluates to 0.00020022690296173096, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 4.1368602978764102e-06.
i:211
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1183722559
```
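
Plugging the numbers from this failure into both bounds shows the problem. The 0.02 relative tolerance is inferred from the log (4.1368e-06 / 2.0684e-04 is 0.02), and the combined atol+rtol alternative uses assumed float defaults, so treat this as an illustrative sketch rather than the checker's actual code:

```python
diff = 6.6161155700683594e-06       # |cur_expected[i] - cur_actual[i]| from the log
expected = 0.00020684301853179932   # cur_expected[i] from the log

# Relative-only bound, as in the failing check (rtol inferred as 0.02)
print(diff <= 0.02 * abs(expected))          # False: the test fails
# Combined numpy.isclose-style bound (assumed atol=1e-5, rtol=1e-4)
print(diff <= 1e-5 + 1e-4 * abs(expected))   # True: the same result passes
```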

@fs-eire (Contributor) commented Mar 6, 2024

WebAssembly CI also fails on test "MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8".

build log link

```
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.9073486328125e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.9073486328125e-06,
cur_actual[i] evaluates to 0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 3.8146971803598717e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.9073486328125e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.9073486328125e-06,
cur_actual[i] evaluates to 0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 3.8146971803598717e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.3113021850585938e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.3113021850585938e-06,
cur_actual[i] evaluates to -0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 2.6226043559063328e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:272: Failure
The difference between cur_expected[i] and cur_actual[i] is 1.3113021850585938e-06, which exceeds *(params.relative_error) * std::abs(cur_expected[i]), where
cur_expected[i] evaluates to 1.3113021850585938e-06,
cur_actual[i] evaluates to -0, and
*(params.relative_error) * std::abs(cur_expected[i]) evaluates to 2.6226043559063328e-08.
i:497
Google Test trace:
/mnt/vss/_work/1/s/onnxruntime/test/providers/checkers.cc:484: provider type: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/providers/base_tester.cc:791: registered execution providers: CPUExecutionProvider
/mnt/vss/_work/1/s/onnxruntime/test/common/random_generator.h:49: ORT test random seed: 1298328783

[  FAILED  ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_S8S8 (7 ms)
```

tianleiwu added a commit that referenced this pull request Mar 7, 2024
### Description

Check float/double/float16/bfloat16 tensors are close like
[numpy.isclose](https://numpy.org/doc/stable/reference/generated/numpy.isclose.html).
```
absolute(a - b) <= (atol + rtol * absolute(b))
```

The default tolerance thresholds:
- float: atol=1e-5 and rtol=1e-4
- float16: atol=0.0025 and rtol=0.001
- bfloat16: atol=0.02 and rtol=0.01
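
The formula and float defaults above can be expressed as a small helper. This is a sketch of the numpy.isclose-style check, not the actual `checkers.cc` code:

```python
def close(a: float, b: float, atol: float = 1e-5, rtol: float = 1e-4) -> bool:
    # absolute(a - b) <= (atol + rtol * absolute(b)), with the float32 defaults above
    return abs(a - b) <= atol + rtol * abs(b)

# An actual value of 0 against a near-zero expected value, which failed the
# relative-only check, passes once a positive atol is in the bound:
print(close(0.0, 1.9073486328125e-06))  # True
```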

### Motivation and Context

Current pipeline has frequent failure due to using only relative
tolerance in #19608:

[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_NoBias_test_U8S8
1: C:\a\_work\1\s\onnxruntime\test\providers\checkers.cc(272): error:
The difference between cur_expected[i] and cur_actual[i] is
1.3113021850585938e-06, which exceeds *(params.relative_error) *
std::abs(cur_expected[i]), where
1: cur_expected[i] evaluates to -1.3113021850585938e-06,
1: cur_actual[i] evaluates to 0, and
1: *(params.relative_error) * std::abs(cur_expected[i]) evaluates to
2.6226043559063328e-08.

It is not reasonable to use relative tolerance alone for a value very close to zero. Combining relative tolerance with a positive absolute tolerance avoids such failures.
zz002 pushed a commit to zz002/onnxruntime that referenced this pull request Mar 7, 2024
zz002 pushed a commit to zz002/onnxruntime that referenced this pull request Mar 7, 2024