[Migraphx EP] Static int8 QDQ support #17931
Conversation
Keeping this in draft until I can get some models quantized and run through your end-to-end examples found here: https://github.com/microsoft/onnxruntime-inference-examples/tree/main/quantization Let me know if there are any further pieces I should be adding or taking into consideration. I shall be uploading the finished end-to-end examples to the onnxruntime-inference-examples repo as well once completed.
@cloudhan @PeixuanZuo can you kick off CI for me? We're looking to get support for this out with our release. I've confirmed functionality with the tagged end-to-end example of resnet50.
@cloudhan can you enable CI again?
@TedThemistokleous FYI, for quick local development iteration to test the CI, you can grab the CI commands from other PRs.
Does that also run local linting? That's what seems to be failing here. I'm not seeing the training error. We're currently using the following for builds:
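For context, a rough sketch of the kind of build invocation involved, written as a small Python driver around ONNX Runtime's build.py. The flags and paths here are illustrative assumptions, not the exact command used in this thread:

```python
# Hypothetical sketch only: drive ONNX Runtime's build script with the
# MIGraphX EP enabled on a ROCm machine. Flags and paths are illustrative
# assumptions, not the exact command referenced above.
import subprocess

build_cmd = [
    "python3", "tools/ci_build/build.py",
    "--config", "Release",
    "--build_dir", "build/Linux",
    "--parallel",
    "--build_wheel",
    "--use_migraphx",                # build the MIGraphX execution provider
    "--migraphx_home", "/opt/rocm",  # assumed ROCm/MIGraphX install prefix
    "--rocm_home", "/opt/rocm",
    "--skip_tests",
]
subprocess.run(build_cmd, check=True)  # raise if the build fails
```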
Force-pushed from 35225d1 to b082a99
/azp run Linux MIGraphX CI Pipeline,orttraining-amd-gpu-ci-pipeline
Azure Pipelines successfully started running 2 pipeline(s).
@cloudhan there are nonsensical answers I'm getting from your linter. Running lintrunner on the onnxruntime root dir gives me the following:
@TedThemistokleous The CI log is weird because it stripped all whitespace changes, so the log itself makes no sense at all. All you need to do is run clang-format on that file and it should be formatted automatically. For lintrunner, you also need a local setup (a rough sketch follows below).
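As a sketch of that local lint workflow (the package names, file path, and flags below are assumptions about a typical lintrunner setup, not commands quoted from this thread):

```python
# Rough sketch of a local clang-format + lintrunner pass; package names,
# the file path, and flags are assumptions, not quoted from this thread.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# One-time setup: install lintrunner and let it pull in its linter adapters.
run(["python3", "-m", "pip", "install", "lintrunner", "lintrunner-adapters"])
run(["lintrunner", "init"])

# Format the offending file directly, then let lintrunner apply any remaining
# auto-fixes (-a) to the files touched by the PR.
run(["clang-format", "-i",
     "onnxruntime/core/providers/migraphx/migraphx_execution_provider.cc"])  # example path
run(["lintrunner", "-a"])
```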
@justinchuby Is it possible to set the Lint pipeline to run automatically without needing member approval?
Changed during format for line-length errors/lintrunner. Added delimiters
Force-pushed from 9e67aba to 7c51b94
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline
/azp run onnxruntime-python-checks-ci-pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 8 pipeline(s).
/azp run Windows ARM64 QNN CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
/azp run Linux MIGraphX CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
@TedThemistokleous, could you please take a look at the failed Lint C++ CI?
@PeixuanZuo there is something strange about linting. For some reason our ROCm-related PRs flag far more linting issues than other non-ROCm PRs, which really hampers how quickly we can land PRs. For example, it flags include/onnxruntime/core/session/onnxruntime_c_api.h:603, which is inside the struct OrtMIGraphXProviderOptions, but looking 10 lines above, inside the struct OrtTensorRTProviderOptions, there are 3 lines over 120 characters long. It's only flagging our newly added lines. Are we being held to a higher standard than other EPs?
Fix errors screaming in the Lint C++ pass. Run another lintrunner pass just in case we get conflicts again.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline
Azure Pipelines successfully started running 9 pipeline(s).
/azp run onnxruntime-python-checks-ci-pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 10 pipeline(s).
Lint CI doesn't treat any EP as special. It lints only the changes in this PR rather than the whole codebase, so it only shows warnings about newly added lines.
Description
Adding static int8 quantization support for the MIGraphX Execution Provider:
- Allows parsing of calibration tables generated by ONNX Runtime's or TensorRT's toolsets
- Adds the proper environment variables to the MIGraphX EP
- Updates the Python API to allow setting execution provider flags (this was missing on the Python side)
- Hooks into MIGraphX's int8 quantization and optimization of models
Motivation and Context
Required so that onnxruntime can pass in models while leveraging the existing tooling for int8 static QDQ quantization.
First step in a series of PRs which will add further static quantization at the operator level as MIGraphX releases further support.
These changes drew heavily from the TensorRT EP and should allow for similar functionality for GPU-based (versus CPU) quantization of models before an inference is performed.
Co-authored-by: Ted Themistokleous <[email protected]>
Co-authored-by: Ted Themistokleous <[email protected]>
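A minimal usage sketch of the feature from Python follows. The provider option keys are assumptions modeled on the TensorRT EP options this work was based on, and the model/table file names are placeholders; check the MIGraphX EP documentation for the exact names.

```python
# Minimal sketch: enable MIGraphX int8 execution with a pre-generated
# calibration table. Option keys and file names below are assumptions
# modeled on the TensorRT EP, not confirmed exact names.
import onnxruntime as ort

migraphx_options = {
    "device_id": 0,
    "migraphx_int8_enable": True,                                       # assumed key
    "migraphx_int8_calibration_table_name": "calibration.flatbuffers",  # assumed key/file
    "migraphx_use_native_calibration_table": False,                     # assumed key
}

session = ort.InferenceSession(
    "resnet50_qdq.onnx",  # placeholder model path
    providers=[
        ("MIGraphXExecutionProvider", migraphx_options),
        "CPUExecutionProvider",  # fallback for nodes MIGraphX can't take
    ],
)
print([inp.name for inp in session.get_inputs()])
```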