
Move Gelu and LayerNorm fusion to L1 optimization #21332

Merged (16 commits on Sep 9, 2024)

Conversation

peishenyan
Contributor

According to #20915, we move the Gelu and LayerNorm fusion to L1, conditioned on the ONNX opset the model imports (LayerNorm requires opset 16+ and Gelu requires opset 20+). If the opset version doesn't meet the requirement, the fusion is deferred to L2 optimization, since the internal contrib op has no requirement on any specific ONNX opset.
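
For context, a minimal sketch of the gating logic being described (the function and parameter names are illustrative, not the actual ONNX Runtime code):

// Illustrative sketch only. L1 may only emit standard ONNX ops, so the
// fusion is gated on the opset the model imports; L2 may emit the
// com.microsoft contrib op, which carries no ONNX opset requirement.
bool CanFuseAtLevel1(bool is_layer_norm, int onnx_opset) {
  const int required_opset = is_layer_norm ? 16 : 20;  // LayerNorm: 16+, Gelu: 20+
  return onnx_opset >= required_opset;  // otherwise defer the fusion to L2
}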

@peishenyan peishenyan changed the title Move Gelu and LayerNorm fusion in L1 optimization Move Gelu and LayerNorm fusion to L1 optimization Jul 12, 2024
@peishenyan
Contributor Author

It seems that I should also change the test cases, since the fusion now has two chances to run (both L1 and L2 optimization), but the current test cases only register the level 2 optimization: graph_transformation_mgr.Register(std::make_unique<GeluFusion>(), TransformerLevel::Level2); (see the sketch below).

WIP.
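
A sketch of what the two-level registration could look like, following the usual pattern in graph_transform_test.cc (ASSERT_STATUS_OK and the manager setup are assumed from the existing tests):

// Register the fusion at both L1 and L2 so a test exercises both
// fusion opportunities (pattern assumed from existing tests).
onnxruntime::GraphTransformerManager graph_transformation_mgr{5};
ASSERT_STATUS_OK(graph_transformation_mgr.Register(
    std::make_unique<GeluFusion>(), TransformerLevel::Level1));
ASSERT_STATUS_OK(graph_transformation_mgr.Register(
    std::make_unique<GeluFusion>(), TransformerLevel::Level2));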

@peishenyan
Contributor Author

Hi @skottmckay @fdwr @guschmue, I've moved the Gelu and LayerNorm fusion to L1, conditioned on the ONNX opset the model imports, and kept the option of deferring the fusion to L2 optimization when the opset version doesn't meet the requirements. I also modified the corresponding GeluFusion and LayerNormFusion tests.

PTAL, thanks.

@peishenyan
Contributor Author

Hi @skottmckay, I hope you're doing well. When you have a moment, could you please take a look at the PR I submitted? I believe your feedback would be very valuable. Thank you for your time! ❤

Code review threads (resolved):
onnxruntime/core/optimizer/layer_norm_fusion.cc
onnxruntime/core/optimizer/gelu_fusion.h
onnxruntime/core/optimizer/gelu_fusion.cc
onnxruntime/test/optimizer/graph_transform_test.cc
skottmckay previously approved these changes Aug 12, 2024
@skottmckay
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@skottmckay
Contributor

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

@skottmckay
Contributor

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 3 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

@peishenyan
Contributor Author

@skottmckay Sorry, I didn't notice the error in the code until I saw the failing checks. The latest commit fixes the bugs. Please rerun the checks. Thanks.

@skottmckay
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@skottmckay
Contributor

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

@skottmckay
Contributor

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 3 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

@peishenyan
Contributor Author

Unfortunately, the checks still fail. I think the failures are related to GeluFusion and LayerNormFusion in orttraining/orttraining/core/optimizer/graph_transformer_utils.cc. Those transformers originally fused the ops at level 1 without any restriction, but after my modification they only fuse the ops when the opset requirement is satisfied.

I think the solution might be to change

case TransformerLevel::Level1: {
...
      transformers.emplace_back(std::make_unique<LayerNormFusion>(compatible_eps));
...
      transformers.emplace_back(std::make_unique<GeluFusion>(compatible_eps));
...
}

to

case TransformerLevel::Level1: {
...
      transformers.emplace_back(std::make_unique<LayerNormFusion>(compatible_eps));
      transformers.emplace_back(std::make_unique<LayerNormFusion>(compatible_eps, TransformerLevel::Level2));
...
      transformers.emplace_back(std::make_unique<GeluFusion>(compatible_eps));
      transformers.emplace_back(std::make_unique<GeluFusion>(compatible_eps, TransformerLevel::Level2));
...
}

to fix this bug.

@skottmckay Are there any suggestions you'd like to offer? Thanks.

@skottmckay
Contributor

Maybe add an additional optional argument to each optimizer's ctor to allow forcing it to only use the contrib op. Set to true when the optimizers are added in orttraining/orttraining/core/optimizer/graph_transformer_utils.cc.
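
That might look roughly like the following self-contained sketch of the idea (force_contrib_op and ChooseGeluOp are illustrative names, not the merged API):

#include <optional>
#include <string>

enum class TransformerLevel { Level1 = 1, Level2 = 2 };

class GeluFusionSketch {
 public:
  explicit GeluFusionSketch(TransformerLevel level = TransformerLevel::Level1,
                            bool force_contrib_op = false)
      : level_(level), force_contrib_op_(force_contrib_op) {}

  // Which op may this pass emit for a model importing onnx_opset?
  // Returns std::nullopt when the L1 pass should defer to L2.
  std::optional<std::string> ChooseGeluOp(int onnx_opset) const {
    if (force_contrib_op_ || level_ == TransformerLevel::Level2) {
      // orttraining would register with force_contrib_op = true to keep
      // its old behavior: always fuse into the contrib op.
      return "com.microsoft.Gelu";  // no ONNX opset requirement
    }
    if (onnx_opset >= 20) {
      return "Gelu";  // standard ONNX Gelu, available from opset 20
    }
    return std::nullopt;  // opset too old for L1; defer to L2
  }

 private:
  TransformerLevel level_;
  bool force_contrib_op_;
};

With this, GeluFusionSketch(TransformerLevel::Level1, /*force_contrib_op*/ true).ChooseGeluOp(17) would still yield the contrib op, matching the pre-PR orttraining behavior.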

@peishenyan
Contributor Author

Good idea!

@skottmckay
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@skottmckay
Contributor

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline

@skottmckay
Contributor

/azp run Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

skottmckay previously approved these changes Sep 6, 2024
edgchen1 previously approved these changes Sep 6, 2024
edgchen1 dismissed stale reviews from skottmckay and themself via b096a8e September 6, 2024 22:43
@skottmckay
Contributor

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

@skottmckay
Contributor

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline

@skottmckay
Contributor

/azp run Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).

Azure Pipelines successfully started running 10 pipeline(s).

skottmckay merged commit 2cdc05f into microsoft:main Sep 9, 2024
82 of 84 checks passed