[DmlEP] BeamSearch node is not supported for DmlEP? #18805

Closed
trajepl opened this issue Dec 13, 2023 · 6 comments
Labels
ep:DML (issues related to the DirectML execution provider)
platform:windows (issues related to the Windows platform)

Comments

@trajepl
Contributor

trajepl commented Dec 13, 2023

Describe the issue

image
ONNX Runtime could not find an implementation for the BeamSearch node, even with onnxruntime extensions enabled for the DML EP.

To reproduce

Install the DML EP, then run the Whisper example: https://github.com/microsoft/Olive/tree/main/examples/whisper.

Remember to change the EP to the one shown in the screenshot:
image

Urgency

No response

Platform

Windows

OS Version

Win11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16.3

ONNX Runtime API

Python

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

No response

@github-actions github-actions bot added ep:DML issues related to the DirectML execution provider platform:windows issues related to the Windows platform labels Dec 13, 2023
@trajepl
Contributor Author

trajepl commented Dec 13, 2023

image
image

@fdwr
Contributor

fdwr commented Dec 15, 2023

com.microsoft.BeamSearch isn't supported by the DML Execution Provider (see the DML EP kernel list), but you should still get a fallback to the CPU for this operator. So the question is why that isn't happening. I suspect @PatriceVignola would be more familiar with this area (noting you're using Whisper and Olive), since he's been trying a number of transformer models lately.
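For reference, the CPU fallback mentioned above relies on the CPU EP sitting after the DML EP in the session's provider list. Here is a minimal sketch, assuming the standard Python API (`onnxruntime.get_available_providers` and `onnxruntime.InferenceSession`). The helper names `provider_priority` and `create_session` are hypothetical and are not part of ONNX Runtime or Olive.

```python
from typing import List, Sequence

DML = "DmlExecutionProvider"
CPU = "CPUExecutionProvider"


def provider_priority(available: Sequence[str], preferred: str = DML) -> List[str]:
    """Return providers in priority order, always ending with the CPU EP.

    Hypothetical helper for illustration only.
    """
    providers: List[str] = []
    if preferred in available and preferred != CPU:
        providers.append(preferred)
    # CPU goes last so that nodes the preferred EP has no kernel for
    # (such as com.microsoft.BeamSearch on DML) still have somewhere to run.
    providers.append(CPU)
    return providers


def create_session(model_path: str):
    """Create a session that prefers DML and keeps CPU as the fallback."""
    # Imported lazily so provider_priority() can be used without onnxruntime.
    import onnxruntime as ort

    providers = provider_priority(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # get_providers() reports the EPs the session actually registered.
    print("Registered providers:", session.get_providers())
    return session
```

If `session.get_providers()` does not list `CPUExecutionProvider`, the problem is likely in the environment or session setup, not in the DML EP itself.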

@trajepl
Contributor Author

trajepl commented Dec 15, 2023

Thanks for your answer. So it sounds like it is hard to [1] run generation tasks on the DML GPU if we want to insert a beam search node into the ONNX graph. The expected behavior is to fall back to the CPU, but performance will be impacted.

I am actually an Olive contributor. I just want to double-check whether [1] is supported in the DML EP, since the current DML example in Olive uses Optimum for generation tasks.

@trajepl
Contributor Author

trajepl commented Dec 15, 2023

As for the fallback failure, I suspect it is an issue with my local environment. I switched to another Linux machine and it worked, so I will close this issue.
Thanks, @fdwr!

@fdwr
Contributor

fdwr commented Dec 15, 2023

> if we want to insert beam search node into onnx graph

Reading com.microsoft.BeamSearch, it doesn't sound like a very GPU-amenable operator, with subgraphs and variable-size ngrams and tokens. Hopefully it's an operator that occurs just once or a few times in the graph, which avoids CPU<->GPU transfer stalls.
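To illustrate why the number of occurrences matters, here is a toy sketch of priority-based node assignment. This is not ONNX Runtime's actual partitioner: the graph is simplified to a linear list of op types, and the function names and supported-op sets are made up for illustration.

```python
from typing import List, Set, Tuple


def assign_nodes(ops: List[str], ep_support: List[Tuple[str, Set[str]]]) -> List[str]:
    """Assign each op to the first EP (in priority order) that supports it."""
    assignment: List[str] = []
    for op in ops:
        for ep_name, supported in ep_support:
            if op in supported:
                assignment.append(ep_name)
                break
        else:
            # No EP in the list supports this op, so there is nothing to fall back to.
            raise LookupError(f"Could not find an implementation for {op}")
    return assignment


def count_device_switches(assignment: List[str]) -> int:
    """Count adjacent node pairs that run on different EPs (potential transfers)."""
    return sum(1 for a, b in zip(assignment, assignment[1:]) if a != b)


# Made-up op list and support sets, for illustration only.
toy_ops = ["Conv", "MatMul", "BeamSearch", "Softmax"]
toy_eps = [
    ("DML", {"Conv", "MatMul", "Softmax"}),
    ("CPU", {"Conv", "MatMul", "Softmax", "BeamSearch"}),
]
toy_assignment = assign_nodes(toy_ops, toy_eps)
print(toy_assignment, count_device_switches(toy_assignment))
```

In this toy model, a single CPU-only node in the middle of the graph costs two device switches, and every additional isolated CPU-only node adds two more. Dropping the CPU entry from `toy_eps` makes `assign_nodes` raise, which mirrors the "could not find an implementation" failure reported in this issue.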

> As for the failure of fallback ... Switched to another linux machine, it worked.

🤔 DML doesn't run on raw Linux. Did you mean using the DML EP atop WSL?

> Just want to double check if [1] is supported in Dml EP as current example in Olive

Sounds like the current answer is no 😉, not until the bug is figured out. DML itself has no awareness of BeamSearch, and so it might also be a graph transformer issue outside the EP. I'm sure Pat will get to this matter :).

@trajepl
Contributor Author

trajepl commented Dec 18, 2023

Thanks! 👍 Yes, it seems to be a graph transformer issue, and I will reach out to Pat for help if needed.

> 🤔 DML doesn't run on raw Linux. Did you mean using the DML EP atop WSL?

Oh, I meant that the fallback logic is not limited to this EP.
