[TensorRT EP] Enhance EP context configs in session options and provider options #19154

Merged
jywu-msft merged 26 commits into main from chi/trt_engine_wrapper_2
Jan 21, 2024


@chilo-ms chilo-ms commented Jan 15, 2024

Several changes:

  1. To align with how other EPs set EP context configs in session options (for example QNN EP, #18877), EP context configs for TRT EP can be configured through:
    1. Session Options: ep.context_enable, ep.context_file_path and ep.context_embed_mode
    2. Provider Options: trt_dump_ep_context_model, trt_ep_context_file_path and trt_dump_ep_context_embed_mode
    3. The settings above map 1:1 in the order listed, and provider options take priority over session options.
    Please note the following rules for the context-model-related provider options:

     1. When dumping or loading the context model, for security reasons TRT EP doesn't allow
        the "ep_cache_context" node attribute of the EP context node to be an absolute path,
        or a relative path that points outside the context model directory.
        This means the engine cache must be in the same directory as the context model or in a sub-directory of it.

     2. When dumping the context model, the engine cache path is changed to a path relative to the context model directory.
        For example:
        If "trt_dump_ep_context_model" and "trt_engine_cache_enable" are both enabled
        and "trt_ep_context_file_path" is "./context_model_dir":
           - if "trt_engine_cache_path" is "" -> the engine cache will be saved to "./context_model_dir"
           - if "trt_engine_cache_path" is "engine_dir" -> the engine cache will be saved to "./context_model_dir/engine_dir"
  2. Users can choose the name of the dumped "EP context" model through trt_ep_context_file_path; please see GetCtxModelPath() for more details.

  3. Added suggested comments from [TensorRT EP] Load precompiled TRT engine file directly #18217
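
To make the 1:1 mapping and the priority rule in item 1 concrete, here is a minimal Python sketch. It is not the TRT EP implementation: the option names come from the description above, while `resolve_ep_context_options` and the plain-dict representation of the two option sets are hypothetical and used for illustration only.

```python
# Session option key -> corresponding TRT EP provider option key,
# following the order listed in the PR description.
SESSION_TO_PROVIDER = {
    "ep.context_enable": "trt_dump_ep_context_model",
    "ep.context_file_path": "trt_ep_context_file_path",
    "ep.context_embed_mode": "trt_dump_ep_context_embed_mode",
}


def resolve_ep_context_options(session_options, provider_options):
    """Hypothetical helper: merge both option sets into effective settings.

    A provider option wins when it is set; otherwise the mapped
    session option (if any) is used as the fallback.
    """
    resolved = {}
    for session_key, provider_key in SESSION_TO_PROVIDER.items():
        if provider_key in provider_options:
            resolved[provider_key] = provider_options[provider_key]
        elif session_key in session_options:
            resolved[provider_key] = session_options[session_key]
    return resolved


if __name__ == "__main__":
    session_options = {
        "ep.context_enable": "1",
        "ep.context_file_path": "./from_session_options",
    }
    provider_options = {"trt_ep_context_file_path": "./context_model_dir"}
    # The file path comes from the provider options, the enable flag
    # falls back to the session options.
    print(resolve_ep_context_options(session_options, provider_options))
```

In this sketch an option that is set in neither place is simply left out, so the EP's own default would apply; how the real code handles defaults is not covered by the description.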

@chilo-ms chilo-ms requested a review from jywu-msft January 15, 2024 22:58
@chilo-ms
Contributor Author

@jywu-msft @gedoensmax
Please help review this PR; I will also add unit tests for it.


@gedoensmax gedoensmax left a comment

This looks great, including the plugin changes on the wrapper and the embed mode warning. Thanks!
Just want to make sure I understand the options order correctly.

onnxruntime/core/session/provider_bridge_ort.cc (outdated, resolved)
@chilo-ms chilo-ms marked this pull request as ready for review January 17, 2024 22:41

chilo-ms commented Jan 19, 2024

@jywu-msft @gedoensmax
TRT EP enforces some rules for the EP context provider options; please see the updated description.
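
As a rough illustration of those rules (a sketch, not the actual TRT EP code), the path check and the engine cache location could look like the following. The function names are hypothetical, POSIX-style paths are assumed, and "trt_ep_context_file_path" is treated as a directory, as in the example in the description.

```python
import posixpath


def is_allowed_ep_cache_context(ep_cache_context: str) -> bool:
    """Hypothetical check for the "ep_cache_context" node attribute.

    Rejects absolute paths and relative paths that would escape
    the context model directory.
    """
    if posixpath.isabs(ep_cache_context):
        return False
    # normpath collapses "a/../b" to "b" and leaves any leading ".." in place.
    normalized = posixpath.normpath(ep_cache_context)
    return normalized != ".." and not normalized.startswith("../")


def resolve_engine_cache_dir(context_model_dir: str, engine_cache_path: str) -> str:
    """Hypothetical helper: place the engine cache relative to the context model dir."""
    if not is_allowed_ep_cache_context(engine_cache_path):
        raise ValueError(
            f"engine cache path {engine_cache_path!r} must be relative to, "
            "and inside, the context model directory"
        )
    return posixpath.normpath(posixpath.join(context_model_dir, engine_cache_path))


if __name__ == "__main__":
    # The two cases from the PR description.
    print(resolve_engine_cache_dir("./context_model_dir", ""))
    print(resolve_engine_cache_dir("./context_model_dir", "engine_dir"))
    # Paths that the rules would reject.
    print(is_allowed_ep_cache_context("/abs/engine.trt"))
    print(is_allowed_ep_cache_context("../outside/engine.trt"))
```

Note that `posixpath.normpath` drops the leading "./", so the results are equivalent to, but not spelled exactly like, the paths in the description.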

jywu-msft previously approved these changes Jan 20, 2024
jywu-msft previously approved these changes Jan 20, 2024
@chilo-ms
Contributor Author

@jywu-msft
I just updated the unit test comment. Please help sign off again, thanks!

@jywu-msft jywu-msft merged commit f3402de into main Jan 21, 2024
92 of 94 checks passed
@jywu-msft jywu-msft deleted the chi/trt_engine_wrapper_2 branch January 21, 2024 18:52
YUNQIUGUO pushed a commit that referenced this pull request Jan 23, 2024
…der options (#19154)
