How to improve GPU utilization [Performance] #17688

Closed
hy846130226 opened this issue Sep 25, 2023 · 4 comments
Labels
ep:DML issues related to the DirectML execution provider platform:windows issues related to the Windows platform quantization issues related to quantization

Comments

@hy846130226

Describe the issue

Hey everyone,

I find that when I run ONNX Runtime, GPU utilization is low: I only reach about 15%. Is there any way to use the remaining 85%?

Is creating more sessions a good plan?

To reproduce

Do not have.

Urgency

No response

Platform

Windows

OS Version

WIN10

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

Microsoft.ML.OnnxRuntime.DirectML

ONNX Runtime API

C++

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

1.13.1

Model File

Do not have

Is this a quantized model?

Yes

@github-actions github-actions bot added ep:DML issues related to the DirectML execution provider platform:windows issues related to the Windows platform quantization issues related to quantization labels Sep 25, 2023
@xadupre
Member

xadupre commented Sep 25, 2023

This is difficult to answer in general, since it depends on your CPU and GPU. Does your GPU run at 100%, go idle, and then start again, or does it run constantly at 15% while the CPU does nothing? You could profile the model to understand why it is slow and which operator is the bottleneck: https://onnxruntime.ai/docs/performance/tune-performance/profiling-tools.html.
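The two patterns described above (short bursts at full load separated by idle time, versus a constant low load) can be told apart from a series of utilization samples. The following is a minimal sketch, not part of ONNX Runtime or DirectML: `classify_utilization` is a hypothetical helper, the samples are assumed to be percentages collected by some external monitoring tool, and the thresholds (90%, 5%, 80%, 30%) are illustrative assumptions only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: classify GPU utilization samples (percent, 0-100).
// All thresholds are illustrative assumptions, not ONNX Runtime values.
std::string classify_utilization(const std::vector<double>& samples) {
    if (samples.empty()) return "no data";

    double sum = 0.0;
    std::size_t busy = 0;  // samples where the GPU is close to saturated
    std::size_t idle = 0;  // samples where the GPU is close to idle
    for (double s : samples) {
        sum += s;
        if (s >= 90.0) ++busy;
        if (s <= 5.0) ++idle;
    }

    const double n = static_cast<double>(samples.size());
    const double mean = sum / n;

    // Bursty: nearly every sample is either saturated or idle, and both occur.
    if (busy > 0 && idle > 0 && (busy + idle) / n >= 0.8) return "bursty";
    // Otherwise decide by the average load.
    if (mean < 30.0) return "steady-low";
    return "steady-high";
}
```

A "bursty" result would suggest the GPU is waiting between runs (for example on the CPU or on data transfer), while "steady-low" would suggest the GPU work itself does not fill the device. Either way, a profile is still needed to find the actual bottleneck.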

@pranavsharma
Contributor

cc @fdwr

@fdwr
Contributor

fdwr commented Sep 25, 2023

Is your model by chance heavily sequential with lots of small operators? Running it from Windows PIX can help spot calling patterns (GPU capture).

cc @smk2007

@hy846130226
Author

Thanks for your replies, @xadupre @pranavsharma @fdwr.

We can continue the discussion in #17686.

It is the same question.

4 participants