[Feature Request] Make DirectML execution provider thread safe (allow Run() concurrency) #22147
Labels
ep:DML
issues related to the DirectML execution provider
feature request
request for unsupported feature or enhancement
Describe the feature request
The DirectML execution provider is not currently thread-safe: if multiple threads call Run() concurrently on the same session, it crashes.
This goes directly against the high-level design of ORT and means that users must create N sessions to achieve N-way concurrency, which dramatically increases memory requirements.
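To make the memory cost concrete, here is a minimal sketch of the "N sessions for N concurrency" pattern. The `SessionPool` class and the `make_session` factory are hypothetical names introduced for illustration, not part of ORT. The only assumption is that each session object exposes a `run(...)` method, as the ORT Python `InferenceSession` does.

```python
import queue
from contextlib import contextmanager


class SessionPool:
    """Hypothetical pool: N independent sessions give N-way concurrency.

    Every session holds its own copy of the model, so memory use grows
    roughly linearly with `size`.
    """

    def __init__(self, make_session, size):
        # make_session is an assumed zero-argument factory that returns
        # a fresh session object with a run(...) method.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(make_session())

    @contextmanager
    def borrow(self):
        session = self._idle.get()  # blocks until some session is free
        try:
            yield session
        finally:
            self._idle.put(session)  # hand the session back to the pool

    def run(self, *args, **kwargs):
        # Each call uses a session that no other thread holds right now,
        # so no single session ever sees concurrent run() calls.
        with self.borrow() as session:
            return session.run(*args, **kwargs)
```

With a pool of size N, up to N threads can be inside `run` at once, but the model is loaded N times, which is exactly the overhead this request would like to avoid.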
Although the problem is documented, it is easy to miss because there is a lot of material promoting the use of a single session instance for concurrency.
https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html
"Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to Run on the same inference session. That is, if an inference session using the DirectML execution provider, only one thread may call Run at a time. Multiple threads are permitted to call Run simultaneously if they operate on different inference session objects."
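For reference, the constraint quoted above ("only one thread may call Run at a time") can be enforced on the caller side with a lock. The sketch below is a stopgap under stated assumptions, not a fix: `SerializedSession` is a hypothetical wrapper, and `FakeSession` is a stand-in that only exists to make the example runnable without a DirectML device. It avoids concurrent `run` calls on one session but provides no actual parallelism.

```python
import threading
import time


class SerializedSession:
    """Hypothetical wrapper: lets many threads share one session safely
    by allowing only one of them inside run() at a time."""

    def __init__(self, session):
        self._session = session
        self._lock = threading.Lock()

    def run(self, *args, **kwargs):
        with self._lock:  # other callers wait here until the lock is free
            return self._session.run(*args, **kwargs)


class FakeSession:
    """Stand-in for a real inference session (assumption for the demo).

    Records the highest number of threads that were inside run() at the
    same moment, so serialization can be checked."""

    def __init__(self):
        self.active = 0
        self.max_active = 0
        self._count_lock = threading.Lock()

    def run(self, output_names, input_feed):
        with self._count_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # pretend to do some work
        with self._count_lock:
            self.active -= 1
        return [input_feed["x"] * 2]
```

Wrapping a session this way keeps a single copy of the model in memory, at the price of all `run` calls executing one after another.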
What are the reasons the EP is not already thread safe?
For me it crashes in the bucketized allocator.
Are there any workarounds?
Describe scenario use case
Any user of the DML EP who also needs concurrency would benefit from being able to use a single inference session.