
[Feature Request] Detect TensorRT cache creation #22244

Closed
henryruhs opened this issue Sep 27, 2024 · 3 comments
Labels
ep:TensorRT (issues related to TensorRT execution provider) · feature request (request for unsupported feature or enhancement)

Comments

henryruhs commented Sep 27, 2024

Describe the feature request

We have the problem that creating the .engine and .profile files for TensorRT takes a long time, and there is no feedback for the user about what is actually happening.

Therefore I suggest adding a general "get_state" to onnxruntime.InferenceSession in order to figure out why it is currently blocked, e.g. is_processing, is_caching, etc.

I would be happy with any other solution to detect the "cache creation" for now, e.g. something along the lines of the sketch below.
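For illustration, a minimal sketch of the kind of workaround I mean today (not an official API): build the session on a background thread and poll the engine-cache directory as a crude progress signal. It assumes the TRT EP's trt_engine_cache_enable/trt_engine_cache_path provider options; model.onnx and the cache path are placeholders. Since the engine file only appears once the build finishes, this mostly gives the user a "still building" heartbeat:

```python
import os
import threading

import onnxruntime as ort

CACHE_DIR = "./trt_cache"  # placeholder cache location, handed to the TRT EP below

providers = [
    ("TensorrtExecutionProvider", {
        "trt_engine_cache_enable": True,
        "trt_engine_cache_path": CACHE_DIR,
    }),
]

done = threading.Event()
result = {}

def build_session() -> None:
    # Session creation blocks while TRT builds (or deserializes) the engine.
    result["session"] = ort.InferenceSession("model.onnx", providers=providers)
    done.set()

threading.Thread(target=build_session, daemon=True).start()

# Crude user feedback while session creation is blocked: poll the cache directory.
while not done.wait(timeout=2.0):
    files = os.listdir(CACHE_DIR) if os.path.isdir(CACHE_DIR) else []
    print(f"TensorRT still building; cache files so far: {files}")

print("session ready")
session = result["session"]
```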

Describe scenario use case

Give the user better feedback while creating caches for TensorRT.

@henryruhs henryruhs added the feature request label Sep 27, 2024
@github-actions github-actions bot added the ep:TensorRT label Sep 27, 2024
henryruhs (Author) commented Oct 7, 2024

There seems to be no logic to invalidate the existing cache.

I experienced exceptions after updating the version of TensorRT.

chilo-ms (Contributor) commented Oct 7, 2024

Thanks for the feedback.

TRT does require a significant amount of time to build the engine if the model is complex.
For now, you can enable TRT EP's provider option trt_detailed_build_log, which gives you more verbose logging and keeps printing information while TRT is doing kernel profiling/selection to build the engine; see the sketch below.
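For example, a minimal sketch of setting this from the Python API (model.onnx and the cache path are placeholders; log_severity_level is lowered so the build log actually reaches the console):

```python
import onnxruntime as ort

providers = [
    ("TensorrtExecutionProvider", {
        "trt_engine_cache_enable": True,         # reuse engines across runs
        "trt_engine_cache_path": "./trt_cache",
        "trt_detailed_build_log": True,          # verbose kernel profiling/selection output
    }),
]

so = ort.SessionOptions()
so.log_severity_level = 1  # 0 = VERBOSE ... 4 = FATAL; the default (2) hides info-level logs

session = ort.InferenceSession("model.onnx", sess_options=so, providers=providers)
```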

Caching/serializing the engine is quick, and so is deserializing the engine cache. So, as long as there is a corresponding engine file on disk, TRT EP will deserialize the engine without rebuilding it; it only rebuilds the engine if the input shape changes.

> There seems to be no logic to invalidate the existing cache.

Yeah, it seems there is no tool to validate an engine; using an engine cache built with a different TRT version throws an exception directly.
TRT has a feature for building version-compatible engines, but TRT EP hasn't integrated it yet. Until then, one way to avoid stale caches is sketched below.
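A workaround sketch (my suggestion, not an official mechanism): key the engine-cache directory on the onnxruntime version and, if its Python bindings happen to be installed, the tensorrt version, so an upgraded TRT starts from a fresh cache instead of tripping over engines built by an older one:

```python
import os

import onnxruntime as ort

# Key the cache directory on the library versions so an upgraded TensorRT
# never tries to deserialize engines built by an older one.
tag = f"ort-{ort.__version__}"
try:
    import tensorrt  # only present if the TensorRT Python bindings are installed
    tag += f"_trt-{tensorrt.__version__}"
except ImportError:
    pass

cache_dir = os.path.join("trt_cache", tag)
os.makedirs(cache_dir, exist_ok=True)

providers = [
    ("TensorrtExecutionProvider", {
        "trt_engine_cache_enable": True,
        "trt_engine_cache_path": cache_dir,  # stale engines from other versions are never touched
    }),
]
session = ort.InferenceSession("model.onnx", providers=providers)
```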

henryruhs (Author) commented
Okay, it sounds like there is not much you can do about either topic. I'll give trt_detailed_build_log a shot and close this issue in the meantime. Thanks
