Describe the issue
Hi,
I'm attempting to run a model on both the CUDA and DML EPs. My model contains several GridSample nodes from opset 16.
If I run on DML, everything is assigned to the correct EP and inference is fast. However, if I run on CUDA, the GridSample nodes are assigned to the CPU EP and inference is slow. If I change the GridSample nodes in my ONNX graph to domain=com.microsoft, then everything runs on the GPU under the CUDA EP, but DML now falls back to the CPU implementation.
It seems the DML implementation of GridSample is part of onnxruntime core, while the CUDA implementation is part of contrib. Is this expected? Should I plan to modify my model in memory depending on the EP, so that the correct implementation is found for whichever EP I'm running on?
Thanks,
Carson
To reproduce
N/A. Happy to provide a simple test model if requested
Urgency
No response
Platform
Windows
OS Version
10
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.16.3
ONNX Runtime API
C++
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
No response
GridSample (onnx domain, opset version 16) for the CUDA EP was added in a recent commit: 5e432a3
You can try building the main branch from source to test it.