
[Feature Request] Make DirectML execution provider thread safe (allow Run() concurrency) #22147

Open
oysteinkrog opened this issue Sep 19, 2024 · 3 comments
Labels
ep:DML (issues related to the DirectML execution provider) · feature request (request for unsupported feature or enhancement)

Comments

@oysteinkrog

Describe the feature request

The DirectML execution provider is not currently thread-safe, which means that any concurrent use of a session's Run() call (multiple threads calling Run() on the same session) will crash.

This goes directly against the high-level design of ORT and means that users must create N sessions to achieve N-way concurrency, which dramatically increases memory requirements.
Although the problem is documented, it is easy to miss, as there is a lot of material promoting the use of a single session instance for concurrency.

https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html
"Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to Run on the same inference session. That is, if an inference session using the DirectML execution provider, only one thread may call Run at a time. Multiple threads are permitted to call Run simultaneously if they operate on different inference session objects."

What are the reasons the EP is not already thread-safe?
For me, it crashes in the bucketized allocator.
Are there any workarounds?
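
One stopgap I can think of (a sketch only, assuming the race is confined to concurrent Run() calls; the model path and input name below are hypothetical) is to serialize Run() behind a lock, which of course gives up the concurrency we actually want:

```python
import threading

import numpy as np
import onnxruntime as ort

# Hypothetical model path and input name, for illustration only.
session = ort.InferenceSession("model.onnx", providers=["DmlExecutionProvider"])
run_lock = threading.Lock()  # serializes all Run() calls on this one session

def infer(batch: np.ndarray):
    # Guarding Run() avoids concurrent entry into the DML EP, assuming the
    # crash only occurs when two threads are inside Run() at the same time.
    with run_lock:
        return session.run(None, {"input": batch})
```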

Describe scenario use case

It would be quite beneficial for users of the DML EP who also need concurrency to be able to use a single inference session:

  • Reduced memory cost.
  • Reduced initialization cost (no need to create N inference sessions).
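
For illustration, this is the usage pattern the feature would enable; today a sketch like this crashes with the DML EP (model path, input name, and shape are hypothetical):

```python
import threading

import numpy as np
import onnxruntime as ort

# One shared session, N worker threads calling Run() concurrently.
session = ort.InferenceSession("model.onnx", providers=["DmlExecutionProvider"])

def worker():
    batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # hypothetical shape
    session.run(None, {"input": batch})  # "input" is a hypothetical input name

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```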
@oysteinkrog oysteinkrog added the feature request label Sep 19, 2024
@github-actions github-actions bot added the ep:DML label Sep 19, 2024
@Djdefrag

Hi, it seems that since version 1.18.0, multi-threading with N sessions is also broken (#20713).

The use of a single multi-threading-compatible session would be great.

@oysteinkrog
Author

> Hi, it seems that since version 1.18.0, multi-threading with N sessions is also broken (#20713).
>
> The use of a single multi-threading-compatible session would be great.

I cannot confirm this; our application seems to work fine with a unique session per thread.

@saulthu

saulthu commented Oct 9, 2024

> I cannot confirm this; our application seems to work fine with a unique session per thread.

@oysteinkrog
Please see #20713; many people are running into this problem. I use one session per thread in the Python API. Each thread loads its own model by creating a session object, onnxruntime.InferenceSession(model_path). Running inference on multiple independent models in different threads very quickly results in Windows fatal exception: access violation when using version >= 1.18.0.
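
For reference, a minimal sketch of the scenario I am describing (model paths, input name, and shape are hypothetical): each thread creates its own InferenceSession and runs inference on its own model.

```python
import threading

import numpy as np
import onnxruntime as ort

# Hypothetical model paths; each thread owns a completely independent session.
MODEL_PATHS = ["model_a.onnx", "model_b.onnx", "model_c.onnx"]

def worker(path: str):
    session = ort.InferenceSession(path, providers=["DmlExecutionProvider"])
    batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # hypothetical shape
    for _ in range(100):
        # Reportedly raises "Windows fatal exception: access violation"
        # on onnxruntime >= 1.18.0, even though no session is shared.
        session.run(None, {"input": batch})

threads = [threading.Thread(target=worker, args=(p,)) for p in MODEL_PATHS]
for t in threads:
    t.start()
for t in threads:
    t.join()
```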
