The results of each run of onnxruntime in cudnn_conv_algo_search (EXHAUSTIVE) mode are different, with an accuracy difference of approximately 1e-6 #19822
Labels: ep:CUDA (issues related to the CUDA execution provider)
Code as follows:
##################################################################################
import argparse
import os
import cv2
import numpy as np
import onnxruntime as ort
from torchvision import transforms
from PIL import Image

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

cuda = True
# pip install onnxruntime-gpu for CUDAExecutionProvider
# conda install cudatoolkit for 'Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so'
# cudnn_conv_algo_search options: EXHAUSTIVE / HEURISTIC / DEFAULT
providers = (
    [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'EXHAUSTIVE'}),
     'CPUExecutionProvider']
    if cuda else ['CPUExecutionProvider']
)

if __name__ == '__main__':
    weights = '/home/lzc/.cache/torch/hub/checkpoints/resnet50-11ad3fa6.onnx'
    data_dir = r'./dataset'
##################################################################################
This is a simple ResNet-50 example, on which I have run many experiments. My conclusion: when "cudnn_conv_algo_search" in "providers" is set to EXHAUSTIVE, the results differ between runs by a numerical accuracy of about 1e-6, but when HEURISTIC or DEFAULT is selected, the results are identical on every run.
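Differences at the ~1e-6 level are consistent with floating-point non-associativity: cuDNN's convolution algorithms (implicit GEMM, Winograd, FFT, etc.) accumulate partial sums in different orders, and float32 addition is not associative, so two valid algorithms can legitimately produce slightly different outputs. A minimal sketch of the underlying effect (pure NumPy, no GPU required):

```python
import numpy as np

# float32 addition is not associative: reordering an accumulation
# changes the result at roughly the level of machine epsilon (~1.2e-7).
a = np.float32(1e-8)
b = np.float32(1.0)

left = (a + b) - b   # a is absorbed: 1.0 + 1e-8 rounds to 1.0 in float32
right = a + (b - b)  # exact: a survives the grouping

print(left, right)   # 0.0 vs 1e-08 -- same expression, different grouping
```

The same effect, scaled up across millions of multiply-accumulates in a convolution, yields the observed ~1e-6 output differences when different algorithms are chosen.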
In addition, I ran the same ONNX model under the Triton Inference Server, and the results were identical every time. I therefore reviewed the source code of the Triton onnxruntime backend and found that it also uses EXHAUSTIVE (https://github.com/triton-inference-server/onnxruntime_backend/blob/main/src/onnxruntime.cc, "cudnn_conv_algo_search" at line 563).
This confuses me. What is the reason for this behavior? And how can the results be made completely identical across runs in EXHAUSTIVE mode?
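One workaround sketch, based only on the provider options already shown above: EXHAUSTIVE benchmarks candidate algorithms at session-creation time, so the winner can vary from run to run; pinning the search mode to HEURISTIC (or DEFAULT) makes the algorithm choice deterministic, at a possible cost in throughput. This is an assumption about the trade-off, not a confirmed fix from the maintainers:

```python
# Sketch: pin cudnn_conv_algo_search so the same conv algorithm is picked
# on every run; 'weights' is the model path from the snippet above.
cuda = True
providers = (
    [('CUDAExecutionProvider', {'cudnn_conv_algo_search': 'HEURISTIC'}),
     'CPUExecutionProvider']
    if cuda else ['CPUExecutionProvider']
)

# The list is passed unchanged to onnxruntime:
#   session = ort.InferenceSession(weights, providers=providers)
print(providers[0][1]['cudnn_conv_algo_search'])  # HEURISTIC
```

Note this only makes the *algorithm selection* reproducible; it does not change the fact that different algorithms produce slightly different (but individually self-consistent) results.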