
[WebGPU] Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528. #22994

Open
xenova opened this issue Dec 3, 2024 · 6 comments · Fixed by #22997
Labels
ep:WebGPU ort-web webgpu provider platform:web issues related to ONNX Runtime web; typically submitted using template

Comments


xenova commented Dec 3, 2024

Describe the issue

Unable to run https://huggingface.co/onnx-community/WavTokenizer-large-speech-75token_decode on WebGPU

[E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Mul node. Name:'/head/istft/Mul_1' Status Message: Failed to run JSEP kernel
failed to inference ONNX model: Error: [WebGPU] Kernel "[Mul] /head/istft/Mul_1" failed. Error: Failed to generate kernel's output[0] with dims [1,3520,3520]. If you are running with pre-allocated output, please make sure the output type/dims are correct. Error: 81415528.

[screenshot of the console error]

To reproduce

https://jsfiddle.net/Lq725aou/3/

Urgency

Blocks WebGPU for this demo: huggingface/transformers.js-examples#17

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.20.1

Execution Provider

'webgpu' (WebGPU)

@xenova xenova added the platform:web issues related to ONNX Runtime web; typically submitted using template label Dec 3, 2024
@github-actions github-actions bot added .NET Pull requests that update .net code ep:WebGPU ort-web webgpu provider labels Dec 3, 2024
@guschmue guschmue removed the .NET Pull requests that update .net code label Dec 3, 2024

fs-eire commented Dec 3, 2024

#22997 has been submitted to fix the shader bug in Transpose. However, it is quite suspicious that the input of a Transpose node is a 1-D tensor; it may be the result of an error that happened earlier.


xenova commented Dec 7, 2024

Unfortunately the error still persists: https://jsfiddle.net/gf7b3ck6/4/

@fs-eire fs-eire reopened this Dec 8, 2024

fs-eire commented Dec 8, 2024

Reopening the issue; it needs further investigation.

ankitm3k pushed a commit to intel/onnxruntime that referenced this issue Dec 11, 2024
### Description

Fix a bug in transpose shader, when input/output rank is 1.

### Motivation and Context

Fixes microsoft#22994

xhcao commented Dec 20, 2024

The JS EP may handle NHWC incorrectly in this case. If the demo sets preferredLayout: NCHW, there is no error.
I need more time to investigate the root cause.
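As a sketch, the workaround above can be applied when creating the session with onnxruntime-web; the model filename below is a placeholder, and the exact option shape assumes the WebGPU EP's documented `preferredLayout` setting:

```javascript
// Sketch only: force the WebGPU EP to keep the NCHW layout, avoiding the
// NHWC layout-transform path described in this comment.
import * as ort from 'onnxruntime-web/webgpu';

const session = await ort.InferenceSession.create('decoder_model.onnx', {
  executionProviders: [
    {
      name: 'webgpu',
      preferredLayout: 'NCHW', // default is 'NHWC' for the WebGPU EP
    },
  ],
});
```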


xhcao commented Dec 26, 2024

The issue also exists on the CUDA EP when compiling with the option --cmake_extra_defines onnxruntime_USE_CUDA_NHWC_OPS=ON, although it throws a different error message:
Non-zero status code returned while running Transpose node. Name:'Transpose_token_154' Status Message: perm size: 3 does not match input rank: 1
But the root cause is the same for the JS EP and the CUDA EP: I think the /head/istft/Squeeze_1 node in the model is not correct.

From the spec (https://onnx.ai/onnx/operators/onnx__Squeeze.html), since the input shape of the /head/istft/Squeeze_1 node is 3-D ([1,1,ConvTranspose_423_o0__d2]), its output shape should be 1-D ([ConvTranspose_423_o0__d2]). But the declared output shape is still 3-D ([ConvTranspose_423_o0__d0,ConvTranspose_423_o0__d1,ConvTranspose_423_o0__d2]), which leads to errors during the layout transform (NCHW -> NHWC) and Transpose optimization.
[screenshot]
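The Squeeze shape rule this comment relies on can be sketched in plain JavaScript (a hypothetical helper, not ONNX Runtime code): with no `axes` input, every size-1 dimension is removed, so a [1, 1, N] input must produce a 1-D output.

```javascript
// Sketch of ONNX Squeeze shape inference (opset 13+ semantics).
// With no `axes`, all dimensions of size 1 are removed; with `axes`,
// only the listed (possibly negative) dimensions are removed.
function squeezeShape(inputShape, axes) {
  if (axes === undefined) {
    return inputShape.filter((d) => d !== 1);
  }
  const normalized = axes.map((a) => (a < 0 ? a + inputShape.length : a));
  return inputShape.filter((_, i) => !normalized.includes(i));
}

// [1, 1, N] with no axes becomes the 1-D shape [N], which is why a
// downstream Transpose with perm [0, 2, 1] no longer matches the rank.
console.log(squeezeShape([1, 1, 3520]));      // [ 3520 ]
console.log(squeezeShape([1, 1, 3520], [0])); // [ 1, 3520 ]
```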

Let us take [1, 8] as an example value for the input shape (int64[batch_size,sequence_length]). When using NCHW ops and calling session.initialize() to parse the model, the output shapes of the nodes are shown below; the shapes parsed from the model are all 3-D (-1 means an unfixed value).
[screenshot]

When calling session.run() to run the model, the output shapes of the nodes are shown below; the shapes from the /head/istft/Squeeze_1 node onward are 1-D.
[screenshot]
But all dims are 1 except the innermost dim, so the model still runs correctly.

When using NHWC ops and session.initialize() to parse the model, after the layout transform we traverse all nodes to apply the Transpose optimization. The topology before traversing the Div node is shown below; note that a Transpose node has been pushed after the ConvTranspose node.
[screenshot]

After traversing the Div node, the Transpose node has passed through the Div, and a new Transpose node with perm [0, 2, 1] has been added after the Where node.
[screenshot]

When using NHWC ops and session.run() to run the model, the output shapes of the nodes are shown below; the shapes from the /head/istft/Squeeze_1 node onward are again 1-D.
[screenshot]
The CUDA EP reports a useful error message when running the new Transpose node. The JS EP does not validate the perm size against the input rank of Transpose, so no error is thrown until the Mul node runs with the wrong input shape [1, 3520, 3520]; the expected shape is [1, 3520, 1].
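The missing check can be sketched as a small hypothetical helper (not ONNX Runtime code), mirroring the message the CUDA EP emits:

```javascript
// Sketch of Transpose input validation: `perm` must be a permutation
// of [0, rank), so its length must equal the input tensor's rank.
function validateTransposePerm(inputShape, perm) {
  if (perm.length !== inputShape.length) {
    throw new Error(
      `perm size: ${perm.length} does not match input rank: ${inputShape.length}`);
  }
  const seen = new Set(perm);
  if (seen.size !== perm.length || perm.some((p) => p < 0 || p >= perm.length)) {
    throw new Error(`perm is not a valid permutation: [${perm}]`);
  }
}

// The failing case from this issue: a 1-D tensor reaching a Transpose
// created with perm [0, 2, 1] during layout optimization.
try {
  validateTransposePerm([3520], [0, 2, 1]);
} catch (e) {
  console.log(e.message); // perm size: 3 does not match input rank: 1
}
```

Failing fast here would surface the error at the Transpose node, as the CUDA EP does, instead of at the later Mul node.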

In summary, I think we must first fix the output dims of the model nodes after the /head/istft/Squeeze_1 node, and the ONNX Runtime core should ensure that the Transpose node does not pass through the Div node and add a new Transpose node after the Where node.


xhcao commented Dec 26, 2024

@jchen10 @hujiajie
