
[Feature Request] Make DirectML execution provider thread safe (allow Run() concurrency) #22147

Open
oysteinkrog opened this issue Sep 19, 2024 · 3 comments
Labels
ep:DML (issues related to the DirectML execution provider) · feature request (request for unsupported feature or enhancement)

Comments

@oysteinkrog

Describe the feature request

The DirectML execution provider is not currently thread-safe, which means that any concurrent use of a session's Run() call (multiple threads calling Run() on the same session) will crash.

This goes directly against the high-level design of ORT and means that users must create N sessions to achieve N-way concurrency, which dramatically increases memory requirements.
Although the problem is documented, it is easy to miss, as there is a lot of material promoting the use of a single session instance for concurrency.

https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html
"Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to Run on the same inference session. That is, if an inference session using the DirectML execution provider, only one thread may call Run at a time. Multiple threads are permitted to call Run simultaneously if they operate on different inference session objects."

What are the reasons the EP is not already thread-safe?
For me, it crashes in the bucketized allocator.
Are there any workarounds?
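
One stopgap I can think of (a sketch only, assuming the race is confined to concurrent Run() calls; the model path and input name below are hypothetical) is to serialize Run() behind a lock, which of course gives up the concurrency we actually want:

```python
import threading

import numpy as np
import onnxruntime as ort

# Hypothetical model path and input name, for illustration only.
session = ort.InferenceSession("model.onnx", providers=["DmlExecutionProvider"])
run_lock = threading.Lock()  # serializes all Run() calls on this one session

def infer(batch: np.ndarray):
    # Guarding Run() avoids concurrent entry into the DML EP, assuming the
    # crash only occurs when two threads are inside Run() at the same time.
    with run_lock:
        return session.run(None, {"input": batch})
```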

Describe scenario use case

It would be quite beneficial for users of the DML EP who also need concurrency to be able to use a single inference session:

  • Reduced memory cost.
  • Reduced initialization cost (no need to create N inference sessions).
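
For illustration, this is the usage pattern the feature would enable; today a sketch like this crashes with the DML EP (model path, input name, and shape are hypothetical):

```python
import threading

import numpy as np
import onnxruntime as ort

# One shared session, N worker threads calling Run() concurrently.
session = ort.InferenceSession("model.onnx", providers=["DmlExecutionProvider"])

def worker():
    batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # hypothetical shape
    session.run(None, {"input": batch})  # "input" is a hypothetical input name

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```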
@oysteinkrog oysteinkrog added the feature request label Sep 19, 2024
@github-actions github-actions bot added the ep:DML label Sep 19, 2024
@Djdefrag

Hi, it seems that since version 1.18.0, multi-threading with N sessions is also broken (#20713).

The use of a single multi-threading-compatible session would be great.

@oysteinkrog
Author

> Hi, it seems that since version 1.18.0, multi-threading with N sessions is also broken (#20713).
>
> The use of a single multi-threading-compatible session would be great.

I cannot confirm this; our application seems to work fine with a unique session per thread.

@saulthu

saulthu commented Oct 9, 2024

> I cannot confirm this; our application seems to work fine with a unique session per thread.

@oysteinkrog
Please see #20713; many people are running into this problem. I use one session per thread in the Python API. Each thread loads its own model by creating a session object, onnxruntime.InferenceSession(model_path). Running inference on multiple independent models in different threads very quickly results in Windows fatal exception: access violation when using version >= 1.18.0.
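
For reference, a minimal sketch of the scenario I am describing (model paths, input name, and shape are hypothetical): each thread creates its own InferenceSession and runs inference on its own model.

```python
import threading

import numpy as np
import onnxruntime as ort

# Hypothetical model paths; each thread owns a completely independent session.
MODEL_PATHS = ["model_a.onnx", "model_b.onnx", "model_c.onnx"]

def worker(path: str):
    session = ort.InferenceSession(path, providers=["DmlExecutionProvider"])
    batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # hypothetical shape
    for _ in range(100):
        # Reportedly raises "Windows fatal exception: access violation"
        # on onnxruntime >= 1.18.0, even though no session is shared.
        session.run(None, {"input": batch})

threads = [threading.Thread(target=worker, args=(p,)) for p in MODEL_PATHS]
for t in threads:
    t.start()
for t in threads:
    t.join()
```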
