CUDA Resize-18 implementation #19595
Make use of the unsafe string constructor, which can convert a native UTF-8 string straight into the string instance buffer.
This reverts commit e5fc5d4.
Add opset 18 features to CUDA, except antialiasing Setting up Antialias filters Dispatch SetupTriliner Move buffer allocation and move antialias to separate file Compiles and runs tests CPU Testing compiles Invoking SetupFilter Adjust inferred dimensions FP works, needs to redo for int Fix int32 case Fixes Bounds fix Finish upscaling setup tests Make Upsample parallel Implement Level1 and Level2 interpolation Implement interpolation and extrapolation kernels Refactor for local allocations Working on Bilinear Upsample Bilinear works Fix Dtype BiCubic works Move Trilinear to function Trilinear 2 steps work Level22 results mismatch. Works 3-D Fix align corners Make BiLinear function Make BiCubic a function CUDA Works
9f766d2 to 976f6a9
onnxruntime/core/providers/cuda/tensor/resize_antialias_impl.cu
### Description
Implement Resize-18 on CUDA.

### Motivation and Context
Performance
@yuslepukhin or @tianleiwu, can you elaborate on the check used here: onnxruntime/onnxruntime/core/providers/cuda/tensor/upsample.cc, lines 180 to 184 in 5ee62a6?
It is not clear to me why this is a safe check for NDHWC vs. NCDHW, since the scales for D and C are often both 1.0f, as some unit tests suggest.
I am looking at this in order to fully support the operators with channels-last layouts.
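To make the concern concrete, here is a minimal sketch of why a scales-only test can be ambiguous for 5-D inputs. The two predicates below are hypothetical and are not the check in upsample.cc (which is not reproduced in this thread); they only assume that a layout is judged by whether the batch and channel scales equal 1.0f.

```cpp
#include <cassert>
#include <vector>

// Hypothetical predicates, NOT the actual onnxruntime check.
// They assume a layout is "plausible" when the batch scale and the
// channel scale are both exactly 1.0f.

// NCDHW: index 0 is N, index 1 is C.
bool ConsistentWithNCDHW(const std::vector<float>& scales) {
  return scales.size() == 5 && scales[0] == 1.0f && scales[1] == 1.0f;
}

// NDHWC: index 0 is N, index 4 is C.
bool ConsistentWithNDHWC(const std::vector<float>& scales) {
  return scales.size() == 5 && scales[0] == 1.0f && scales[4] == 1.0f;
}

// True when the scales alone cannot tell the two layouts apart.
bool LayoutIsAmbiguous(const std::vector<float>& scales) {
  return ConsistentWithNCDHW(scales) && ConsistentWithNDHWC(scales);
}
```

Under these assumptions, the vector {1, 1, 2, 2, 1} reads as NDHWC with D unchanged and H, W doubled, but equally as NCDHW with D, H doubled and W unchanged, so `LayoutIsAmbiguous` returns true for it. A vector such as {1, 1, 1, 2, 2} is only consistent with NCDHW.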
I also believe the resize kernel is not sufficiently tested for NCHW with int8/uint8 inputs in the CUDA EP: onnxruntime/onnxruntime/core/providers/cuda/tensor/resize_impl.cu, lines 328 to 334 in 5ee62a6.
Judging by this line, the result for int kernels will always be 0. I verified this by adding the following unit test, which is the same as:
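The unit test and the kernel lines referenced above are not reproduced here. As a rough illustration of how an integer bilinear kernel can end up returning 0, the sketch below assumes the interpolation weights are cast to the element type before the multiply; that pattern is an assumption made for illustration, and both function names are hypothetical rather than taken from resize_impl.cu.

```cpp
#include <cassert>
#include <cstdint>

// Assumed failure pattern: each weight is cast to T before multiplying.
// For integer T, any weight in (0, 1) truncates to 0.
template <typename T>
T BilinearWeightsCastToT(T x00, T x10, T x01, T x11, float dx, float dy) {
  return static_cast<T>(x00 * static_cast<T>((1.0f - dx) * (1.0f - dy)) +
                        x10 * static_cast<T>(dx * (1.0f - dy)) +
                        x01 * static_cast<T>((1.0f - dx) * dy) +
                        x11 * static_cast<T>(dx * dy));
}

// One possible alternative: accumulate in float and cast once at the end.
// The final cast truncates toward zero; rounding is left out for brevity.
template <typename T>
T BilinearAccumulateInFloat(T x00, T x10, T x01, T x11, float dx, float dy) {
  const float acc = static_cast<float>(x00) * (1.0f - dx) * (1.0f - dy) +
                    static_cast<float>(x10) * dx * (1.0f - dy) +
                    static_cast<float>(x01) * (1.0f - dx) * dy +
                    static_cast<float>(x11) * dx * dy;
  return static_cast<T>(acc);
}
```

With four uint8_t neighbours all equal to 100 and dx = dy = 0.5f, every weight is 0.25f, so the first function returns 0 while the second returns 100. The first function only gives a non-zero result when a weight is exactly 1.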
Go ahead and file an issue, and we will look into it. If you have a suggested change, do not hesitate to propose it.