[Performance] slower inference after upgrade from 1.19.2 to 1.20.1 #23006
Comments
The initial finding is that in 1.20.1 there is a long stretch of CPU time before the Conv kernel is launched (the kernel time itself is very close between versions). It is likely caused by the cuDNN frontend introduced in #19470.
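One way to confirm where the host-side time goes is ONNX Runtime's built-in profiler. A minimal sketch, where the filename `model.onnx` and the input shape are placeholders, not values from this thread:

```python
import numpy as np
import onnxruntime as ort

so = ort.SessionOptions()
so.enable_profiling = True  # emit a chrome://tracing-compatible JSON trace

sess = ort.InferenceSession("model.onnx", sess_options=so,
                            providers=["CUDAExecutionProvider"])
inp = sess.get_inputs()[0]

# Placeholder shape; substitute the model's actual input dimensions.
x = np.random.rand(1, 3, 640, 640).astype(np.float32)
sess.run(None, {inp.name: x})

# end_profiling() returns the path of the trace file; the per-node
# timings there should show whether Conv setup dominates CPU time.
print(sess.end_profiling())
```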
@tianleiwu But the CPU EP is also slower. Perhaps the problem is not in CUDA?
@henryruhs What is this? As I said, the CPU EP is also slower, so the problem is not in CUDA.
Then change the overall report; most of it is CUDA based. Also, this reaction is kind of rude.
@henryruhs Flooding the thread with useless messages is rude.
Probably useless only for those who are not smart enough: update the opset version of the models.
@henryruhs Updating the opset version is not required by the specification; otherwise onnxruntime, for example, would require it explicitly and refuse to run the model. That is a silly thought that might come to mind. By the way, you keep clogging up the thread and insulting people.
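For reference, bumping a model's opset can be done offline with the ONNX version converter. A minimal sketch, where the filename and the target opset 17 are assumptions for illustration, not values confirmed in this thread:

```python
import onnx
from onnx import version_converter

# Hypothetical filename; use the actual model path.
model = onnx.load("YoloV7Face.onnx")

# Target opset 17 is an example; pick the version you need.
converted = version_converter.convert_version(model, 17)
onnx.checker.check_model(converted)
onnx.save(converted, "YoloV7Face_opset17.onnx")
```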
Describe the issue
Inference is ~15x slower after upgrading from 1.19.2 to 1.20.1.
The problem arises when input tensor resolutions are interleaved between calls, for example when running pyramid images for object detection.
CUDA: ~15x slower
CPU: ~2-3x slower
To reproduce
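No repro script was attached; the following is a minimal sketch of the interleaved-resolution pattern described above. It assumes the attached model accepts dynamic spatial dimensions; the filename and shapes are placeholders:

```python
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "YoloV7Face.onnx",  # placeholder path for the attached model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
name = sess.get_inputs()[0].name

# Alternate input resolutions as in an image pyramid; each shape
# change can force the Conv kernels to be re-prepared.
shapes = [(1, 3, 640, 640), (1, 3, 480, 480), (1, 3, 320, 320)]

start = time.perf_counter()
for _ in range(10):
    for s in shapes:
        x = np.random.rand(*s).astype(np.float32)
        sess.run(None, {name: x})
print(f"total: {time.perf_counter() - start:.2f}s")
```

Running this under both 1.19.2 and 1.20.1 should reproduce the reported gap if the slowdown is tied to shape changes.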
Urgency
Not urgent. Will stay on 1.19.2
Platform
Windows
OS Version
10
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.20.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA, CPU
Execution Provider Library Version
CUDA 12.4
Model File
YoloV7Face.zip
Is this a quantized model?
No