DML EP takes a very long time compiling and does not exit #21255
Comments
Did you try disabling all optimizations? You may also try running shape inference on the model (https://onnx.ai/onnx/api/shape_inference.html) and saving it with shape information before loading it into onnxruntime. The function taking the time seems to be related to graph fusion. You can disable it with one of the session options listed in https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/dml/dml_session_options_config_keys.h, set through https://onnxruntime.ai/docs/api/python/api_summary.html#onnxruntime.SessionOptions.add_session_config_entry.
How did you generate the model? Maybe it can be optimized before onnxruntime gets it.
About the file you shared: you don't need to share the weights as well, only the *.onnx file is needed.
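A minimal sketch of the two suggestions above (offline shape inference plus disabling DML graph fusion). The config key "ep.dml.disable_graph_fusion" is an assumption based on dml_session_options_config_keys.h; check the header that ships with your onnxruntime version for the exact name.
from onnx import shape_inference
import onnxruntime as ort

# Run shape inference offline and write a copy of the model with shape
# information; infer_shapes_path works on the file path directly, which
# also handles models that keep their weights in external data files.
shape_inference.infer_shapes_path("text_encoder_3.onnx", "text_encoder_3_shaped.onnx")

# Disable DML graph fusion through a session config entry before creating
# the session (key name assumed from dml_session_options_config_keys.h).
sess_opt = ort.SessionOptions()
sess_opt.add_session_config_entry("ep.dml.disable_graph_fusion", "1")
sess = ort.InferenceSession("text_encoder_3_shaped.onnx", sess_opt,
                            providers=["DmlExecutionProvider"])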
Hi @xadupre, I tried adding this config. We use torch export; I'm not sure whether it can be optimized by torch export or not. The original model was downloaded from Hugging Face with the following code:
import torch
Hi @xadupre, can this compiling issue be fixed?
This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details. |
Describe the issue
When creating an ort.InferenceSession for the model below, the call takes a very long time and does not return.
To reproduce
Use the ONNX file shared on Google Drive:
https://drive.google.com/file/d/1y-evMcenYe-Q0JpyQvuwO0YHBX8gSSBE/view?usp=sharing
import numpy as np
import onnxruntime as ort

# Create a session with the DirectML execution provider; this call never returns.
EP_list = ['DmlExecutionProvider']
sess_opt = ort.SessionOptions()
sess_opt.log_severity_level = 0  # verbose logging
sess_opt.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_BASIC
sess = ort.InferenceSession("text_encoder_3.onnx", sess_opt, providers=EP_list)
Debugging on Windows shows that the process is stuck in TryCreateCompiledOperator (see attached picture).
The problem is the same with an NVIDIA 3080 Ti and an AMD 7900 XTX.
Urgency
Urgent
Platform
Windows
OS Version
Windows 11 Pro OS build 22631.3593
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.0
ONNX Runtime API
Python
Architecture
X64
Execution Provider
DirectML
Execution Provider Library Version
No response