
[WIP] export weights as a constants in graph, so can do constant folding to them #19278

Open

wants to merge 1 commit into base: main
Conversation

@zhijxu-MS (Contributor) commented Jan 26, 2024
While supporting inference for one 1p (first-party) model, we found that the model's weights are graph inputs, so ORT's constant-folding optimizer cannot fold anything that depends on them. After exporting the weights as constants in the graph, we see a 10%+ gain on this model.

This is a draft.

We still need to think through the scenario: model train > model eval > model train > model eval > ...; in that case the weights change, so we can't simply export them as constants once. One idea is to keep a model_version in both the training session and the inference session; if the version numbers mismatch, we re-export the graph.
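The version-check idea above can be sketched in a few lines. This is a hypothetical sketch, not ORTModule code: the class and method names (`TrainingSession`, `InferenceSession`, `update_weights`, `_reexport_with_constant_weights`) are assumptions made for illustration.

```python
# Hypothetical sketch of the model_version idea: the training session
# bumps a version on every weight update, and the inference session
# re-exports the weights as graph constants whenever versions diverge.

class TrainingSession:
    def __init__(self):
        self.model_version = 0

    def update_weights(self):
        self.model_version += 1  # weights changed -> bump the version

class InferenceSession:
    def __init__(self, trainer):
        self.trainer = trainer
        self.exported_version = -1  # nothing exported yet

    def run(self):
        # Stale export detected: re-export before running inference.
        if self.exported_version != self.trainer.model_version:
            self._reexport_with_constant_weights()
        return "inference output"

    def _reexport_with_constant_weights(self):
        # Placeholder: here the current weights would be exported as
        # graph constants so constant folding can optimize them.
        self.exported_version = self.trainer.model_version
```

In the train > eval > train > eval loop, each eval after a training step triggers exactly one re-export, and repeated evals with unchanged weights reuse the already-folded graph.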

@zhijxu-MS zhijxu-MS changed the title WIP] export weights as a constants in graph, so can do constant folding to them [WIP] export weights as a constants in graph, so can do constant folding to them Jan 26, 2024
@zhijxu-MS zhijxu-MS force-pushed the zhijxu/improve-ortmodule-inference branch from b70cd75 to 604695a on January 31, 2024 11:31
@zhijxu-MS zhijxu-MS force-pushed the zhijxu/improve-ortmodule-inference branch from 604695a to 8bcd1d8 on February 19, 2024 08:54
@zhijxu-MS zhijxu-MS force-pushed the zhijxu/improve-ortmodule-inference branch from 8bcd1d8 to 955cb5b on February 19, 2024 09:26