Drop QDQ around more nodes #21376
Conversation
With matching quantization parameters, DequantizeLinear ∘ Flatten ∘ QuantizeLinear is equivalent to just the Flatten, and dropping the Q/DQ pair saves some floating-point computation. There's already support for a similar optimization on the equivalent Reshape pattern; this change extends the existing optimization to also recognize Flatten. microsoft#21167
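A minimal numpy sketch (illustration only, not the ONNX Runtime implementation) of why the pattern collapses: with matching scale and zero-point, quantize ∘ dequantize is the identity on the quantized values, and Flatten only rearranges the shape, so the Q/DQ pair can be dropped. `dequantize`/`quantize` here are hypothetical helpers following standard QuantizeLinear semantics.

```python
import numpy as np

scale, zero_point = np.float32(0.02), np.uint8(128)

def dequantize(q):
    return (q.astype(np.float32) - np.float32(zero_point)) * scale

def quantize(x):
    return np.clip(np.round(x / scale) + np.float32(zero_point), 0, 255).astype(np.uint8)

q_input = np.random.randint(0, 256, size=(2, 3, 4), dtype=np.uint8)

# Original pattern: dequantize, Flatten (axis=1) in float, re-quantize.
qdq_result = quantize(dequantize(q_input).reshape(2, -1))

# Optimized pattern: Flatten operates directly on the integer data.
optimized = q_input.reshape(2, -1)

assert np.array_equal(qdq_result, optimized)  # element values are untouched
```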
Currently, the DropQDQNodesRules optimization removes the QuantizeLinear and DequantizeLinear nodes from DequantizeLinear ∘ MaxPool ∘ QuantizeLinear. However, if the x_scale/y_scale values are non-positive (a negative scale inverts the relative ordering of the quantized values), MaxPool no longer commutes with quantization, so the optimization changes the results. This change adds a check that the scale in the QuantizeLinear (or DequantizeLinear) is a positive scalar, and a new selector that disallows removing the QDQ around MaxPool when it is not. microsoft#21176
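A small numpy sketch (again illustration only, assuming standard QuantizeLinear semantics) of the failure: with a negative scale, larger real values map to smaller quantized values, so a max taken directly on the quantized data picks the wrong element.

```python
import numpy as np

scale, zero_point = np.float32(-0.1), np.int8(0)

def quantize(x):
    return np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)

def dequantize(q):
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([1.0, 5.0], dtype=np.float32)
q = quantize(x)                       # [-10, -50]: ordering is inverted

correct = quantize(dequantize(q).max())  # max in float, then re-quantize: -50
rewritten = q.max()                      # max directly on quantized data: -10

print(correct, rewritten)             # -50 vs -10: the rewrite changes the result
```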
Only does so if the scale is positive
No integer implementations are present for these ops, so they need to stay in floating-point. microsoft#21287
Don't expect the drop qdq optimization to work for multiple inputs for now.
Apparently the type constraints for these ops don't include 16-bit integers.
Results don't appear to match
Changes are on top of #21182, since it also needs to check for positive scale when dropping QDQ around ReduceMin and ReduceMax
To keep lines under 120 chars
/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline
/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline
/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline
Azure Pipelines successfully started running 3 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 10 pipeline(s).
I guess these are no longer lined up anyway after moving some to the previous line.
/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline
/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline
/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline
Azure Pipelines successfully started running 3 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 10 pipeline(s).
I had missed the reason why the Windows builds were failing on the last few commits (I currently don't have a Windows system to try locally), sorry.
Seeing: fatal error C1128: number of sections exceeded object file format limit: compile with /bigobj, so apparently these additional tests are pushing this file over the limit. Given there's already a statement setting /bigobj for the sibling graph_transform_test, I'll try simply copy-pasting that for qdq_transformer_test (onnxruntime/cmake/onnxruntime_unittests.cmake, line 892 in b9f3a5d).
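A hypothetical sketch of what that copy-paste might look like; the `TEST_SRC_DIR` variable and the exact property call mirror common CMake usage, but the real statement in onnxruntime_unittests.cmake may differ.

```cmake
if (MSVC)
  # qdq_transformer_test.cc now exceeds the default 2^16 COFF section limit,
  # so give it the same /bigobj option the sibling graph_transform_test.cc uses.
  set_source_files_properties("${TEST_SRC_DIR}/optimizer/qdq_transformer_test.cc"
                              PROPERTIES COMPILE_OPTIONS "/bigobj")
endif()
```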
/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline
/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline
/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline
Azure Pipelines successfully started running 2 pipeline(s), but failed to run 1 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 10 pipeline(s).
/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline
Azure Pipelines successfully started running 3 pipeline(s).
Description
Extends the Drop QDQ optimization to remove DequantizeLinear and QuantizeLinear nodes from around more operators, including Flatten, ReduceMin, and ReduceMax.
Motivation and Context
To reduce floating-point conversions in quantized inference. Mainly motivated by the Flatten case, since that will show up in graphs exported from PyTorch to ONNX. But to make the change complete, it also extends the optimization to the larger set of ops for which it is valid.
#21375