
Inference Output CPU vs CUDA not the same [1.19.2] #22866

Open
lschaupp opened this issue Nov 17, 2024 · 2 comments
Labels
ep:CUDA, stale

Comments

@lschaupp

Describe the issue

Hey everyone,

I was testing a model for face occlusion and I am getting different results between GPU and CPU.
Happy to help if anyone can point me in the right direction (e.g. how to debug) so we can fix this issue.

Cheers

To reproduce

Run the same model on the same hardware with the CPU and CUDA execution providers -> different results. A minimal sketch is below.
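A minimal repro sketch, assuming placeholders for the model path and input shape (neither is from the original report): run one input through a CPU-only session and a CUDA session, then compare the outputs.

```python
import numpy as np
import onnxruntime as ort

# Hypothetical input shape; adjust to the actual model.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

# "model.onnx" is a placeholder path.
cpu_sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
cuda_sess = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])

name = cpu_sess.get_inputs()[0].name
cpu_out = cpu_sess.run(None, {name: x})[0]
cuda_out = cuda_sess.run(None, {name: x})[0]

# Report the largest elementwise difference between the two providers.
print("max abs diff:", np.abs(cpu_out - cuda_out).max())
```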

Urgency

Very important

Platform

Linux

OS Version

20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.19.2

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 12.5

github-actions bot added the ep:CUDA label Nov 17, 2024
@tianleiwu (Contributor) commented Nov 18, 2024

Is it an fp16 model? The CPU might use fp32 in computation and thus get different accuracy compared to CUDA.

Also, CUDA has TF32 enabled by default; you can set the environment variable NVIDIA_TF32_OVERRIDE=0 or set the CUDA provider option use_tf32 = 0 to disable it.
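For example, a minimal sketch of the provider-option route (the model path is a placeholder):

```python
import onnxruntime as ort

providers = [
    # use_tf32=0 forces full fp32 math in matmul/conv on the CUDA EP.
    ("CUDAExecutionProvider", {"use_tf32": 0}),
    "CPUExecutionProvider",
]
sess = ort.InferenceSession("model.onnx", providers=providers)
```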

One way to debug is to build onnxruntime from source, adding --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1 to the build command line. Then set the environment variables ORT_DEBUG_NODE_IO_DUMP_INPUT_DATA=1 and ORT_DEBUG_NODE_IO_DUMP_OUTPUT_DATA=1. You should then be able to check and compare the intermediate values between the CPU EP and the CUDA EP. See https://onnxruntime.ai/docs/build/eps.html#cuda for more information.
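A sketch of the dump setup, assuming the environment variables are set before the session is created (the model path is a placeholder):

```python
import os

# These flags only take effect in a build configured with
# --cmake_extra_defines onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS=1.
os.environ["ORT_DEBUG_NODE_IO_DUMP_INPUT_DATA"] = "1"
os.environ["ORT_DEBUG_NODE_IO_DUMP_OUTPUT_DATA"] = "1"

import onnxruntime as ort  # import after setting the variables

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
# Run the same input once per EP and diff the dumped tensors to locate the
# first node where the CPU and CUDA outputs diverge.
```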


This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

github-actions bot added the stale label Dec 18, 2024