[Feature Request] Expose use_per_session_threads options in the C# layer #19703
Labels
api:CSharp
issues related to the C# API
feature request
request for unsupported feature or enhancement
Describe the feature request
We host a model inference service that runs inference on many ONNX models (>100) simultaneously. Because the runtime by default creates a thread pool per session (that is, per model), this spawns a large number of threads, and we run into contention issues that lead to long inference latency.
One potential solution is to use the global thread pool instead of per-session thread pools, so that all models share one set of threads. However, the use_per_session_threads option is not exposed in the C# layer, and it defaults to true. Could you please add this option to the C# API so we can configure it? Thanks.
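To make the thread-count concern concrete, here is a rough, library-independent sketch. It does not call ONNX Runtime at all: it uses Python's standard `ThreadPoolExecutor` as a stand-in for the runtime's thread pools. The function names and the figure of 4 threads per pool are assumptions for illustration only. It compares how many worker threads exist when every session owns a pool versus when all sessions share one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def spin_up(pool, n_threads, seen, lock):
    """Force `pool` to run n_threads tasks concurrently and record worker ids."""
    barrier = threading.Barrier(n_threads)

    def task():
        # wait() returns only once n_threads workers are inside this task,
        # so the pool must have started that many distinct threads.
        barrier.wait()
        with lock:
            seen.add(threading.get_ident())

    for future in [pool.submit(task) for _ in range(n_threads)]:
        future.result()


def count_worker_threads(num_sessions, threads_per_pool, shared):
    """Count distinct worker threads used by `num_sessions` hypothetical sessions.

    shared=False: one pool per session (analogous to per-session threads).
    shared=True:  a single pool used by every session (analogous to a global pool).
    """
    n_pools = 1 if shared else num_sessions
    pools = [ThreadPoolExecutor(max_workers=threads_per_pool) for _ in range(n_pools)]
    seen, lock = set(), threading.Lock()
    try:
        for i in range(num_sessions):
            pool = pools[0] if shared else pools[i]
            spin_up(pool, threads_per_pool, seen, lock)
        # Pools are still alive here, so every recorded id is a distinct live thread.
        return len(seen)
    finally:
        for pool in pools:
            pool.shutdown()


if __name__ == "__main__":
    # 100 sessions and 4 threads per pool are illustrative numbers only.
    print("per-session pools:", count_worker_threads(100, 4, shared=False))
    print("shared pool:      ", count_worker_threads(100, 4, shared=True))
```

With per-session pools the thread count grows linearly with the number of loaded models, while a shared pool keeps it fixed. That growth is the effect this request is hoping to avoid.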
Describe scenario use case
We run inference for more than 100 ONNX models simultaneously in a web service written in C# on top of OnnxRuntime.