
[Feature Request] Request grid_sample 5D support 🌟 #21382

Open
juntaosun opened this issue Jul 17, 2024 · 7 comments
Labels
ep:CUDA (issues related to the CUDA execution provider), feature request (request for unsupported feature or enhancement)

Comments


juntaosun commented Jul 17, 2024

Describe the feature request

Many models now rely on 5D grid_sample, but the exported ONNX model does not seem to support it on GPU yet.
It currently runs on the CPU,
which makes inference very slow compared to the original torch.nn.functional.grid_sample.
Searching the issues shows this has been raised many times in the past. As of 2024-07-17, the latest onnxruntime still does not support it.
In addition, I have seen an implementation in a branch:

7c0ae44

I hope this can be supported as soon as possible; I think it would be great for most developers.
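For context, here is what the requested op computes: the 5-D variant of grid_sample takes an (N, C, D, H, W) volume and an (N, Do, Ho, Wo, 3) grid of normalized coordinates in [-1, 1] and gathers a resampled volume. A minimal NumPy sketch of the semantics (nearest-neighbour mode, align_corners=True only; the function name and simplifications are mine, not onnxruntime code):

```python
import numpy as np

def grid_sample_5d_nearest(inp, grid):
    """Naive 5-D grid_sample (nearest neighbour, align_corners=True).

    inp:  (N, C, D, H, W) volume
    grid: (N, Do, Ho, Wo, 3) with coords in [-1, 1], ordered (x, y, z),
          i.e. x indexes W, y indexes H, z indexes D -- matching the
          torch.nn.functional.grid_sample convention.
    """
    N, C, D, H, W = inp.shape
    _, Do, Ho, Wo, _ = grid.shape
    out = np.zeros((N, C, Do, Ho, Wo), dtype=inp.dtype)
    # Unnormalize [-1, 1] -> voxel indices (align_corners=True).
    x = np.rint((grid[..., 0] + 1) / 2 * (W - 1)).astype(int)
    y = np.rint((grid[..., 1] + 1) / 2 * (H - 1)).astype(int)
    z = np.rint((grid[..., 2] + 1) / 2 * (D - 1)).astype(int)
    x = np.clip(x, 0, W - 1)
    y = np.clip(y, 0, H - 1)
    z = np.clip(z, 0, D - 1)
    for n in range(N):
        # Advanced indexing gathers all (Do, Ho, Wo) samples per channel.
        out[n] = inp[n][:, z[n], y[n], x[n]]
    return out
```

An identity grid (linspace from -1 to 1 along each axis) reproduces the input volume, which is a quick sanity check; the real op additionally supports trilinear/bicubic modes and padding modes, which this sketch omits.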

Describe scenario use case

I believe many people need this on CUDA. Thank you for your efforts and excellent work. ❤️

@juntaosun juntaosun added the feature request request for unsupported feature or enhancement label Jul 17, 2024
@github-actions github-actions bot added the ep:CUDA issues related to the CUDA execution provider label Jul 17, 2024
@tianleiwu
Contributor

@liqunfu

@cleardusk

I completely agree with @juntaosun.
For example, LivePortrait currently cannot be exported to ONNX because 5D grid_sample is not supported on GPU. :(

@tianleiwu @liqunfu

@juntaosun
Author

I completely agree with @cleardusk.
Are there any plans to improve the performance and speed of grid_sample in onnxruntime-gpu?
@tianleiwu @liqunfu

Contributor

tianleiwu commented Sep 17, 2024

@liqunfu, is there a plan to add this support in the 1.20 release?

If not, I suggest that other people who are interested continue from your draft and submit a pull request. What do you think?


fedral commented Sep 20, 2024

Agreed.
On onnxruntime 1.17.0 + CUDA 11.8 + opset 20, grid_sample at 1080p takes 70 ms on CPU, while GPU mode is much slower than CPU mode, around 140 ms (double). The torch implementation takes only about 0.01 ms, a really big difference.

Looking forward to the onnxruntime team supporting and optimizing the 4D/5D grid_sample op on GPU. Thanks!

@juntaosun
Author

I hope this gets some attention. More and more models use this op, but grid_sample in onnxruntime is dozens of times slower than in torch.

Contributor

liqunfu commented Sep 23, 2024

I added/updated the GridSample CPU implementation when the op was added/updated in ONNX, as part of the ONNX integration with ORT. The implementation was inherited from an existing contrib op. I do not see a quick way to improve its performance by dozens of times. Usually GridSample is preceded by an AffineGrid; in that case the two ops can be fused, and the implementation can be greatly improved. I wonder if this is the use case?
I expect someone else to take over this work, because I am on another task now.
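To illustrate the fusion mentioned above: instead of materializing the AffineGrid output as a full (N, H, W, 2) tensor and then gathering with GridSample, the affine matrix can be applied to output coordinates on the fly. A hedged NumPy sketch of the idea (2-D case, nearest-neighbour, align_corners=True; function name and simplifications are mine, not ORT internals):

```python
import numpy as np

def affine_grid_sample_fused(inp, theta):
    """Fused AffineGrid + GridSample (2-D, nearest, align_corners=True).

    inp:   (N, C, H, W) images
    theta: (N, 2, 3) affine matrices mapping normalized output coords
           to normalized input coords, as in torch's affine_grid.
    The (N, H, W, 2) sampling grid is never materialized.
    """
    N, C, H, W = inp.shape
    ys, xs = np.linspace(-1, 1, H), np.linspace(-1, 1, W)
    X, Y = np.meshgrid(xs, ys)                         # (H, W) output coords
    coords = np.stack([X, Y, np.ones_like(X)], axis=0).reshape(3, -1)
    out = np.empty_like(inp)
    for n in range(N):
        sx, sy = theta[n] @ coords                     # source coords in [-1, 1]
        ix = np.clip(np.rint((sx + 1) / 2 * (W - 1)).astype(int), 0, W - 1)
        iy = np.clip(np.rint((sy + 1) / 2 * (H - 1)).astype(int), 0, H - 1)
        out[n] = inp[n][:, iy, ix].reshape(C, H, W)    # gather directly
    return out
```

With an identity theta this reproduces the input, and a theta with a negated x row flips the image horizontally; a fused kernel along these lines avoids one full grid tensor of reads and writes, which is where the speedup would come from.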
