[Web] Stable Diffusion Inpainting FP16 UNET outputs NaNs #22983
Comments
@jdp8 Thank you for reporting the issue. Could you please share the repro steps (including the JavaScript code)? A jsfiddle link would also be good.
@fs-eire Sorry for the delay. I made a simple jsfiddle that runs the Stable Diffusion Inpainting model for 1 step and prints the UNET output, which is a tensor filled with NaNs. I left it at that point so as not to complicate the code further. The code was heavily inspired by the SD Turbo ORT Web example code.
Repro Steps
Other Info
Let me know if you have any questions or if I left something out. Thank you!
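For reference, the kind of check the jsfiddle performs (scanning the UNET output tensor for NaNs) can be sketched in plain JavaScript. The output name `out_sample` in the comment below is an assumption; the real key should be taken from the session's `outputNames`:

```javascript
// Count NaN values in a typed array (e.g. the .data of an ort.Tensor).
// Plain JavaScript; runs in Node or the browser.
function countNaNs(data) {
  let nans = 0;
  for (let i = 0; i < data.length; i++) {
    if (Number.isNaN(data[i])) nans++;
  }
  return nans;
}

// In the actual repro this would be called on the UNET result, e.g.
// (hypothetical names -- the real output key comes from session.outputNames):
//   const results = await unetSession.run(feeds);
//   console.log(countNaNs(results.out_sample.data));

// Standalone demonstration on synthetic data:
const sample = new Float32Array([0.5, NaN, -1.25, NaN]);
console.log(countNaNs(sample)); // → 2
```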
I am investigating this issue.
@fs-eire Thank you! I'll try it tomorrow and let you know.
@fs-eire Sorry for the late response. Just tried it and the UNET is no longer outputting NaNs. Thank you so much!
### Description
Fix a bug caused by potential out-of-bound reads of `W` in the Conv2DMatMul shader.

### Motivation and Context
Fixes #22983
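The actual fix lives in the WGSL shader, but the failure mode is easy to illustrate in plain JavaScript: an out-of-range array read yields `undefined` (on the GPU, the analogue is an indeterminate value), and once an invalid operand enters a multiply-accumulate, the NaN poisons everything downstream. A hedged sketch of the spirit of the guard (not the actual shader code):

```javascript
// Simplified dot product like a conv/matmul inner loop. Without a bounds
// guard, an out-of-range read of `w` returns undefined in JS, and
// undefined * x === NaN, which then propagates through the accumulation.
function dotUnguarded(w, x, n) {
  let acc = 0;
  for (let i = 0; i < n; i++) acc += w[i] * x[i];
  return acc;
}

// With a guard (illustrating the idea of the Conv2DMatMul fix: never read
// `W` past the end of its buffer; treat out-of-bound contributions as 0):
function dotGuarded(w, x, n) {
  let acc = 0;
  for (let i = 0; i < n; i++) {
    const wi = i < w.length ? w[i] : 0;
    const xi = i < x.length ? x[i] : 0;
    acc += wi * xi;
  }
  return acc;
}

const w = new Float32Array([1, 2]);    // only 2 weights...
const x = new Float32Array([3, 4, 5]); // ...but the loop runs 3 iterations
console.log(dotUnguarded(w, x, 3)); // NaN
console.log(dotGuarded(w, x, 3));   // 11 (1*3 + 2*4; OOB read treated as 0)
```

A single such NaN in one convolution output is enough to fill the whole UNET output with NaNs after a few layers, which matches the symptom reported above.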
Describe the issue
I converted stable-diffusion-inpainting and stable-diffusion-2-inpainting to FP16 ONNX format using both the optimum-cli export command and this script. The models work fine in Python ONNX Runtime, but in ONNX Runtime Web the UNET outputs NaNs for some unknown reason.
The code running the models in the browser was translated to JavaScript from the pipeline script and the ONNX pipeline script. I'm fairly confident my code is correct, though I could be wrong; the input shapes are as expected, and ORT Web does not complain about them.
Does anybody have any idea what could be causing these NaNs in the UNET? Could this be an issue with the model conversion or with my code? Any assistance will be greatly appreciated, as I have tried pretty much everything I can think of.
Additional Context
To reproduce
To quickly reproduce the issue, the UNET of any of my converted models can be loaded into an Inference Session and an object of 3 random inputs passed to the session. The object consists of the following entries:
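Building such random inputs can be sketched as follows. The shape used here is purely illustrative; the real input names and shapes must be taken from the converted UNET itself (e.g. via the session's `inputNames`):

```javascript
// Build a Float32Array of the given shape, filled with random values in [-1, 1).
function randomTensorData(shape) {
  const size = shape.reduce((a, b) => a * b, 1);
  const data = new Float32Array(size);
  for (let i = 0; i < size; i++) data[i] = Math.random() * 2 - 1;
  return data;
}

// In the actual repro this data would be wrapped in an ort.Tensor and put
// into the feeds object, e.g. (hypothetical name and shape, not taken from
// the real model):
//   feeds["sample"] = new ort.Tensor("float32", randomTensorData(shape), shape);

// Standalone demonstration of the shape bookkeeping:
const shape = [1, 4, 8, 8];
const data = randomTensorData(shape);
console.log(data.length); // → 256
```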
Some of the models I have converted are:
Urgency
Somewhat urgent.
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.20.0
Execution Provider
'webgpu' (WebGPU)