[Feature Request] #21316

Open
lnotgm opened this issue Jul 11, 2024 · 2 comments
Labels
ep:TensorRT (issues related to TensorRT execution provider)
feature request (request for unsupported feature or enhancement)

Comments

lnotgm commented Jul 11, 2024

Describe the feature request

Can onnxruntime support directly loading *.engine or *.trt to initialize the session when using TensorRT EP?

Describe scenario use case

Currently, when the TensorRT EP is used with engine caching enabled, the *.engine file is written to disk so that later session initialization is fast. In my production environment, I want the *.engine file to be encrypted, and I would like to load it directly from memory, as follows:
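// Decrypt the model into an in-memory byte buffer (Decryptor is the poster's own helper)
auto bytes = Decryptor.read("./encrypted_model.engine");

// Initialize the session from the decrypted bytes
Ort::Session session(env, bytes.data(), bytes.size(), session_options);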

@lnotgm lnotgm added the feature request request for unsupported feature or enhancement label Jul 11, 2024
@github-actions github-actions bot added the ep:TensorRT issues related to TensorRT execution provider label Jul 11, 2024
@jywu-msft jywu-msft assigned jywu-msft and chilo-ms and unassigned jywu-msft Jul 29, 2024
@jywu-msft
Member

@chilo-ms will share details

@chilo-ms
Contributor

chilo-ms commented Sep 11, 2024

@lnotgm
The "model" input to Ort::Session() must be in ONNX format, whether it is passed as a file path or as a byte stream.
The TensorRT EP does provide an embedded engine model feature that is close to your request. The difference is that the input model is a wrapper around the engine and is still an ONNX file. This lets you load the *.engine almost directly, which reduces session initialization time.
For details on using the "Embedded engine model / EPContext model", see:
https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#tensorrt-ep-caches
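
Editor's sketch (not from the thread): one way the encrypted-engine scenario might be combined with the EPContext model, assuming the provider option names from the linked documentation (trt_engine_cache_enable, trt_dump_ep_context_model, trt_ep_context_file_path, trt_ep_context_embed_mode) and assuming that embed mode 1 stores the engine binary inside the wrapper ONNX file, so that encrypting that single file also covers the engine. The helper names ep_context_dump_options and xor_transform are hypothetical, and the XOR routine is only a placeholder for a real cipher. The ONNX Runtime calls appear as comments so the block compiles without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using OptionList = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: provider options for a one-time run that dumps an
// EPContext (embedded engine) model. Option names are taken from the TensorRT
// EP documentation; check them against the ONNX Runtime version in use.
OptionList ep_context_dump_options(const std::string& ctx_model_path) {
    return {
        {"trt_engine_cache_enable", "1"},
        {"trt_dump_ep_context_model", "1"},
        {"trt_ep_context_file_path", ctx_model_path},
        // Assumption: "1" embeds the engine binary in the wrapper ONNX file.
        {"trt_ep_context_embed_mode", "1"},
    };
}

// Placeholder cipher (NOT secure): XOR each byte with a repeating key.
// XOR is its own inverse, so the same call encrypts and decrypts.
std::vector<std::uint8_t> xor_transform(const std::vector<std::uint8_t>& input,
                                        const std::string& key) {
    assert(!key.empty());
    std::vector<std::uint8_t> out(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        out[i] = input[i] ^ static_cast<std::uint8_t>(key[i % key.size()]);
    }
    return out;
}

// Build step (with ONNX Runtime and the TensorRT EP available): pass the
// pairs above to UpdateTensorRTProviderOptions, register the provider with
// AppendExecutionProvider_TensorRT_V2, and create a session once from the
// original ONNX model so the wrapper file is written. Then encrypt that file.
//
// Deploy step: decrypt the wrapper into memory and use the byte-stream
// constructor from the original post:
//   std::vector<std::uint8_t> bytes = xor_transform(encrypted, key);
//   Ort::Session session(env, bytes.data(), bytes.size(), session_options);
```

The buffer given to Ort::Session is still an ONNX model, as the reply above requires; only the wrapper file is encrypted at rest. In production, the XOR placeholder would be replaced by an authenticated cipher.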
