[E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running LayerNormalization node. #21012
Labels:
ep:DML (issues related to the DirectML execution provider)
feature request (request for unsupported feature or enhancement)
model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)
platform:windows (issues related to the Windows platform)
Describe the feature request
Hi Experts,
I recently started working on AI/ML. I am currently trying to run a Hugging Face Optimum model on the GPU using the DirectML execution provider (DML EP).
Platform: Windows 11
Model: https://huggingface.co/optimum/m2m100_418M
Changes:
import onnxruntime
from optimum.onnxruntime import ORTModelForSeq2SeqLM

# Enable verbose onnxruntime logging
session_opt = onnxruntime.SessionOptions()
session_opt.log_severity_level = 0

# provider = "CPUExecutionProvider"
provider = "DmlExecutionProvider"

NUM_ITERATIONS = 1
model_name = "optimum/m2m100_418M"
hi_text = "जीवन एक चॉकलेट बॉक्स की तरह है।"  # Hindi: "Life is like a box of chocolates."
chinese_text = "生活就像一盒巧克力。"  # Chinese: "Life is like a box of chocolates."

model = ORTModelForSeq2SeqLM.from_pretrained(model_name, provider=provider, session_options=session_opt)
When I use "DmlExecutionProvider", I see the error below:
2024-06-12 14:35:21.2694023 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running LayerNormalization node. Name:'/model/decoder/layer_norm/Mul/LayerNormFusion/' Status Message: C:\a_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\MLOperatorAuthorImpl.cpp(2468)\onnxruntime_pybind11_state.pyd!00007FFA9B5A09BF: (caller: 00007FFA9B5A2174) Exception(3) tid(1ff4) 887A0005 The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action.
With "CPUExecutionProvider", however, I don't see any issue and the model runs successfully.
Could you help me resolve this error so the model runs with the DML EP?
Thanks
Describe scenario use case
Trying to run a Hugging Face Optimum model with the DML EP.