If Branch Constant Folding #18105
Complete Finalize
Currently, there is a problem QDQ pairing transformer that ensures pairing between QDQ pairs including within the graph.
Change the type of the templates map so memory is partially released when not all of the functions are removed. Change the inline naming schema with slashes.
Add tests. Rename the function. Remove functions directly from the partitioner.
gramalingam
previously approved these changes
Nov 8, 2023
edgchen1
previously approved these changes
Nov 8, 2023
/azp run Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 3 pipeline(s).
when GraphViewer is not able to topologically sort new nodes due to lack of edges. We generate edges and make sure that nodes with subgraphs get implicit inputs setup.
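The idea in the commit message above (derive the missing edges, count implicit inputs of subgraph-carrying nodes, then sort) can be illustrated with a small sketch. This is not the onnxruntime implementation: the dict-based node layout, the `implicit_inputs` key, and the helper names `build_edges` and `topological_order` are all hypothetical and exist only for this illustration.

```python
from collections import deque


def build_edges(nodes):
    """Derive producer -> consumer edges by matching output names to input names."""
    producer = {}
    for idx, node in enumerate(nodes):
        for out in node["outputs"]:
            producer[out] = idx
    edges = set()
    for idx, node in enumerate(nodes):
        # Implicit inputs are outer-scope values consumed inside a node's subgraph;
        # they must create edges too, or the node may be ordered before its producers.
        for name in node["inputs"] + node.get("implicit_inputs", []):
            if name in producer:
                edges.add((producer[name], idx))
    return edges


def topological_order(nodes):
    """Kahn's algorithm over the derived edges; returns node names in a valid order."""
    edges = build_edges(nodes)
    indegree = [0] * len(nodes)
    consumers = {i: [] for i in range(len(nodes))}
    for src, dst in sorted(edges):
        indegree[dst] += 1
        consumers[src].append(dst)
    ready = deque(i for i, d in enumerate(indegree) if d == 0)
    order = []
    while ready:
        i = ready.popleft()
        order.append(nodes[i]["name"])
        for j in consumers[i]:
            indegree[j] -= 1
            if indegree[j] == 0:
                ready.append(j)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle")
    return order


# Nodes listed out of order, as freshly inlined nodes might be.
nodes = [
    {"name": "if_inner", "inputs": ["cond"], "implicit_inputs": ["y"], "outputs": ["z"]},
    {"name": "add", "inputs": ["x", "x"], "outputs": ["y"]},
    {"name": "less", "inputs": ["y", "x"], "outputs": ["cond"]},
]
order = topological_order(nodes)
```

Without the `implicit_inputs` lookup, `if_inner` would depend only on `cond`, and the dependency on `y` through its subgraph would be invisible to the sort.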
yuslepukhin
dismissed stale reviews from edgchen1 and gramalingam
via
November 10, 2023 01:36
60d99fb
edgchen1
reviewed
Nov 13, 2023
gramalingam
reviewed
Nov 13, 2023
gramalingam
approved these changes
Nov 14, 2023
Some minor comments.
pranavsharma
approved these changes
Nov 14, 2023
petermcaughan
pushed a commit
that referenced
this pull request
Nov 20, 2023
…args (#18462)

### Description
Truncate trailing non-existing arguments. Make sure we do not skip non-existing arguments in the middle, because shape inference relies on their proper position. This also affects the argument position in the Edges, which must be properly rebuilt each time an `If` node branch is inlined. Make sure that when we rename Defs in subgraphs, new renamed defs are created in those subgraphs instead of pointing to outer scope defs. Add unit test.

### Motivation and Context
This is a follow-up to #18105. Currently, the non-trailing arguments are simply ignored and the edges are created with potentially incorrect positions.
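A minimal sketch of why only trailing missing arguments can be dropped. It assumes the ONNX convention that an omitted optional input is written as an empty name; the function names `trim_trailing_missing` and `naive_drop_missing` are hypothetical and are not taken from the onnxruntime code.

```python
def trim_trailing_missing(args):
    """Drop missing arguments from the end only; keep missing ones in the middle."""
    end = len(args)
    while end > 0 and args[end - 1] == "":
        end -= 1
    return args[:end]


def naive_drop_missing(args):
    """Drop every missing argument. This shifts the positions of later arguments."""
    return [a for a in args if a != ""]


# Second input is omitted; the last two are omitted as well.
inputs = ["X", "", "bias", "", ""]

trimmed = trim_trailing_missing(inputs)
naive = naive_drop_missing(inputs)

# Position-sensitive consumers (shape inference, edge argument indices) still
# find "bias" at index 2 after trimming, but at index 1 after the naive filter.
bias_index_trimmed = trimmed.index("bias")
bias_index_naive = naive.index("bias")
```

The placeholder in the middle is what keeps `bias` at the index that shape inference and edge construction expect.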
kleiti
pushed a commit
to kleiti/onnxruntime
that referenced
this pull request
Mar 22, 2024
### Description
When the condition of an `If` node proves to be a constant value, inline the corresponding subgraph, which enables more constant folding and optimization.

### Motivation and Context
Newly converted models feature lots of nested `If` nodes that can be inlined and collapsed. In particular, for the sample models we are gaining on TorchScript exported models.

For `HF Mobile Bert Dynamo`, runtime went down from 0.069 -> 0.046. In total, AOT inlining + `If` constant folding yields an improvement of about 50%, 0.102 -> 0.046, bringing us very close to TorchScript exported models.

`HF Bart Dynamo` further improves 0.668 -> 0.45. AOT + `If` constant folding improves 0.98 -> 0.45.

Earlier the size of HF Mobile Bert was **161MB+**; it is now **98MB**. The HF Bart Dynamo pre-optimized model was about **1.2GB**; it is now **710MB**.

![image](https://github.com/microsoft/onnxruntime/assets/11303988/1491a247-d371-4e66-85a3-2aeb702e8ca0)
kleiti
pushed a commit
to kleiti/onnxruntime
that referenced
this pull request
Mar 22, 2024
…args (microsoft#18462)

### Description
Truncate trailing non-existing arguments. Make sure we do not skip non-existing arguments in the middle, because shape inference relies on their proper position. This also affects the argument position in the Edges, which must be properly rebuilt each time an `If` node branch is inlined. Make sure that when we rename Defs in subgraphs, new renamed defs are created in those subgraphs instead of pointing to outer scope defs. Add unit test.

### Motivation and Context
This is a follow-up to microsoft#18105. Currently, the non-trailing arguments are simply ignored and the edges are created with potentially incorrect positions.
### Description
When the condition of an `If` node proves to be a constant value, inline the corresponding subgraph, which enables more constant folding and optimization.

### Motivation and Context
Newly converted models feature lots of nested `If` nodes that can be inlined and collapsed. In particular, for the sample models we are gaining on TorchScript exported models.

For `HF Mobile Bert Dynamo`, runtime went down from 0.069 -> 0.046. In total, AOT inlining + `If` constant folding yields an improvement of about 50%, 0.102 -> 0.046, bringing us very close to TorchScript exported models.

`HF Bart Dynamo` further improves 0.668 -> 0.45. AOT + `If` constant folding improves 0.98 -> 0.45.

Earlier the size of HF Mobile Bert was 161MB+; it is now 98MB. The HF Bart Dynamo pre-optimized model was about 1.2GB; it is now 710MB.
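For readers unfamiliar with the transformation, here is a toy sketch of `If` branch folding on a simplified, dict-based graph. It is not the onnxruntime code: the node layout, the `constants` map, and the function name `inline_constant_ifs` are hypothetical, and the sketch assumes all value names are already unique, so it skips the renaming of defs that a real implementation has to perform.

```python
def inline_constant_ifs(nodes, constants):
    """Replace every `If` whose condition is a known constant with its chosen branch."""
    result = []
    for node in nodes:
        if node["op"] == "If" and node["inputs"][0] in constants:
            taken = node["then"] if constants[node["inputs"][0]] else node["else"]
            # Recurse first so nested `If` nodes with constant conditions collapse too.
            inlined = inline_constant_ifs(taken["nodes"], constants)
            # Branch outputs take the names the `If` node exposed to the outer graph.
            rename = dict(zip(taken["outputs"], node["outputs"]))
            for inner in inlined:
                result.append({
                    **inner,
                    "inputs": [rename.get(n, n) for n in inner["inputs"]],
                    "outputs": [rename.get(n, n) for n in inner["outputs"]],
                })
        else:
            # Non-`If` nodes, and `If` nodes with a dynamic condition, are kept as-is.
            result.append(node)
    return result


graph = [
    {"op": "If", "inputs": ["use_fast"], "outputs": ["y"],
     "then": {"outputs": ["t_out"], "nodes": [
         {"op": "If", "inputs": ["use_relu"], "outputs": ["t_out"],
          "then": {"outputs": ["r"],
                   "nodes": [{"op": "Relu", "inputs": ["x"], "outputs": ["r"]}]},
          "else": {"outputs": ["s"],
                   "nodes": [{"op": "Sigmoid", "inputs": ["x"], "outputs": ["s"]}]}},
     ]},
     "else": {"outputs": ["e_out"],
              "nodes": [{"op": "Tanh", "inputs": ["x"], "outputs": ["e_out"]}]}},
    {"op": "Neg", "inputs": ["y"], "outputs": ["out"]},
]

# Both conditions are known ahead of time, so two levels of `If` collapse.
constants = {"use_fast": True, "use_relu": False}
flat = inline_constant_ifs(graph, constants)
```

In this example the two nested `If` nodes reduce to a single `Sigmoid` feeding `Neg`. Once the branches are part of the main graph, ordinary constant folding and fusion passes can see across what used to be subgraph boundaries, which is the effect the description attributes the runtime and size gains to.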