E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA #22019

Closed
FurkanGozukara opened this issue Sep 6, 2024 · 4 comments
Labels
build build issues; typically submitted using template

Comments

FurkanGozukara commented Sep 6, 2024

I have tried every combination but I can't fix this error.

I have tried torch 2.4.0 cu118 with the command below:

pip install onnxruntime-gpu==1.19.0 --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-11/pypi/simple/

I have tried torch 2.4.0 cu124 with the command below:

pip install onnxruntime-gpu==1.19.0

Both give the error below. How can I fix it?

I need onnxruntime-gpu==1.19.0 at minimum; 1.18 is not working.

2024-09-07 00:48:13.5944632 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 onnxruntime::TryGetProviderInfo_CUDA] D:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1637 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "R:\face_fusion_Next_v2\facefusion\venv\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

Urgency

I need this fixed very urgently.

Target platform

Windows

Visual Studio Version

Visual Studio Build Tools 2022 LTSC 17.8

Pip Freeze



Microsoft Windows [Version 10.0.19045.4842]
(c) Microsoft Corporation. All rights reserved.

R:\face_fusion_Next_v2\facefusion\venv\Scripts>activate

(venv) R:\face_fusion_Next_v2\facefusion\venv\Scripts>pip freeze
aiofiles==23.2.1
annotated-types==0.7.0
anyio==4.4.0
certifi==2024.8.30
charset-normalizer==3.3.2
click==8.1.7
colorama==0.4.6
coloredlogs==15.0.1
contourpy==1.3.0
cycler==0.12.1
deepspeed @ https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/deepspeed-0.11.2_cuda121-cp310-cp310-win_amd64.whl
exceptiongroup==1.2.2
fastapi==0.112.4
ffmpy==0.4.0
filelock==3.15.4
filetype==1.2.0
flatbuffers==24.3.25
fonttools==4.53.1
fsspec==2024.9.0
GPUtil==1.4.0
gradio==4.43.0
gradio_client==1.3.0
gradio_rangeslider==0.0.6
h11==0.14.0
hjson==3.1.0
httpcore==1.0.5
httpx==0.27.2
huggingface-hub==0.24.6
humanfriendly==10.0
idna==3.8
importlib_resources==6.4.4
Jinja2==3.1.4
kiwisolver==1.4.7
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.2
mdurl==0.1.2
mpmath==1.3.0
networkx==3.2.1
ninja==1.11.1.1
numpy==2.1.0
nvidia-cuda-runtime-cu12==12.6.68
onnx==1.16.2
onnxruntime==1.19.0
onnxruntime-gpu==1.19.0
opencv-python==4.10.0.84
orjson==3.10.7
packaging==24.1
pandas==2.2.2
pillow==10.4.0
protobuf==5.28.0
psutil==6.0.0
py-cpuinfo==9.0.0
pydantic==2.9.0
pydantic_core==2.23.2
pydub==0.25.1
Pygments==2.18.0
pynvml==11.5.3
pyparsing==3.1.4
pyreadline3==3.4.1
python-dateutil==2.9.0.post0
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.2
requests==2.32.3
rich==13.8.0
ruff==0.6.4
scipy==1.14.1
semantic-version==2.10.0
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
starlette==0.38.4
sympy==1.13.2
tensorrt==10.3.0
tensorrt-cu12==10.3.0
tensorrt-cu12_bindings==10.3.0
tensorrt-cu12_libs==10.3.0
tomlkit==0.12.0
torch==2.4.0+cu124
torchaudio==2.4.0+cu124
torchvision==0.19.0+cu124
tqdm==4.66.5
triton @ https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/triton-2.1.0-cp310-cp310-win_amd64.whl
typer==0.12.5
typing_extensions==4.12.2
tzdata==2024.1
urllib3==2.2.2
uvicorn==0.30.6
websockets==12.0

(venv) R:\face_fusion_Next_v2\facefusion\venv\Scripts>
FurkanGozukara added the build label (build issues; typically submitted using template) on Sep 6, 2024
tianleiwu (Contributor) commented Sep 6, 2024

First of all, you should uninstall the onnxruntime package, since that build is CPU-only.

The following are recommended:

CUDA 12.x

  • PyTorch 2.4 for cuda 12.4
  • onnxruntime-gpu==1.19.2
  • CUDA 12.4 ~ 12.6
  • cuDNN 9.x: unzip it to a local directory. You can also pip install nvidia-cudnn-cu12 and then add site-packages\nvidia\cudnn\bin to PATH if you can locate the package installation path.
  • Latest VC Runtime

Installation looks like:

pip install onnxruntime-gpu==1.19.2
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

You will need to set PATH to point to the CUDA and cuDNN bin directories, for example:

set PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin;C:\nvidia\cudnn-windows-x86_64-9.3.0.75_cuda12-archive\bin;%PATH%

For troubleshooting, you can follow this guide. I guess the root cause might be that PATH is not set correctly.
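To check the PATH requirement above, here is a minimal sketch (the directory names are examples from this thread; adjust them to your actual installation):

```python
import os

# Example directories from this thread - adjust to your actual install locations.
required_dirs = [
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\bin",
    r"C:\nvidia\cudnn-windows-x86_64-9.3.0.75_cuda12-archive\bin",
]

def missing_from_path(dirs, path=None):
    """Return the directories from `dirs` that are not on PATH."""
    entries = {
        os.path.normcase(p.rstrip("\\/"))
        for p in (path or os.environ.get("PATH", "")).split(os.pathsep)
    }
    return [d for d in dirs if os.path.normcase(d.rstrip("\\/")) not in entries]

if __name__ == "__main__":
    for d in missing_from_path(required_dirs):
        print(f"Not on PATH: {d}")
```

If any directory is reported missing, error 126 ("module could not be found") when loading onnxruntime_providers_cuda.dll is the expected symptom.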

CUDA 11.8

Based on the matrix, we can see that torch 2.3 uses cuDNN 8, but torch 2.4 uses cuDNN 9:

| PyTorch version | Python | Stable CUDA | Experimental CUDA | Stable ROCm |
|---|---|---|---|---|
| 2.4 | >=3.8, <=3.12 | CUDA 11.8, CUDA 12.1, CUDNN 9.1.0.70 | CUDA 12.4, CUDNN 9.1.0.70 | ROCm 6.1 |
| 2.3 | >=3.8, <=3.11 (3.12 experimental) | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 | ROCm 6.0 |

For onnxruntime-gpu 1.19.0 or 1.19.2, the CUDA 12 build uses cuDNN 9, but the CUDA 11.8 build uses cuDNN 8.

Note that the major versions of both CUDA and cuDNN must match between PyTorch and ONNX Runtime.

The following are recommended for cuda 11.8:

  • PyTorch 2.3.1 for cuda 11.8
  • onnxruntime-gpu 1.19.2
  • CUDA 11.8
  • cuDNN 8.9
  • Latest VC Runtime

Installation looks like:

pip install onnxruntime-gpu==1.19.2 --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-11/pypi/simple/

pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
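After either installation, a quick sanity check (a sketch, not part of the original comment): `CUDAExecutionProvider` should appear in the provider list; if only `CPUExecutionProvider` shows up, the CUDA/cuDNN DLLs could not be loaded.

```python
def available_providers():
    """Return ONNX Runtime's available execution providers, or [] if not installed."""
    try:
        import onnxruntime as ort
        return ort.get_available_providers()
    except ImportError:
        return []

if __name__ == "__main__":
    providers = available_providers()
    print(providers)
    if "CUDAExecutionProvider" not in providers:
        print("CUDA provider unavailable - check PATH and CUDA/cuDNN versions")
```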

FurkanGozukara (Author) commented:

@tianleiwu thank you for the reply.

What fixed it for me was putting the cuDNN 9.4 DLLs into the capi folder and adding that capi folder to the user PATH.

It is really bothersome :/ but at least it works.
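An alternative to copying DLLs around is to register the cuDNN directory on the Windows DLL search path at runtime, before importing onnxruntime. A sketch, assuming Windows and a hypothetical cuDNN install path:

```python
import os

# Hypothetical cuDNN bin directory - adjust to your installation.
CUDNN_BIN = r"C:\nvidia\cudnn-windows-x86_64-9.4.0_cuda12-archive\bin"

def register_dll_dir(path):
    """On Windows (Python 3.8+), add `path` to the DLL search path.
    Returns True if registered, False otherwise (non-Windows or missing dir)."""
    if hasattr(os, "add_dll_directory") and os.path.isdir(path):
        os.add_dll_directory(path)
        return True
    return False

# Call before importing onnxruntime so its CUDA provider DLLs can resolve.
register_dll_dir(CUDNN_BIN)
```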


PannagaS commented Dec 22, 2024

What is the capi folder? I am facing the same issue when I try to install TensorRT and ONNX Runtime in Docker. Can you please explain what you meant by putting the cuDNN 9.4 DLLs into the capi folder? Thanks!

tianleiwu (Contributor) commented:

> What is the capi folder? I am facing the same issue when I try to install TensorRT and ONNX Runtime in Docker. Can you please explain what you meant by putting the cuDNN 9.4 DLLs into the capi folder? Thanks!

The capi folder is the directory containing the onnxruntime DLLs, located at a path like python3.10/site-packages/onnxruntime/capi/ or venv\lib\site-packages\onnxruntime\capi\. Placing the cuDNN DLLs there is one way to make sure dependent DLLs can be located. You can also set PATH (on Windows) or LD_LIBRARY_PATH (on Linux) properly to avoid the issue.
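To find where the capi folder lives in your environment, a small sketch using only the standard library (it works whether or not onnxruntime is installed):

```python
import importlib.util
import os

def find_capi_dir():
    """Locate onnxruntime's capi directory, or return None if not installed."""
    spec = importlib.util.find_spec("onnxruntime")
    if spec is None or not spec.submodule_search_locations:
        return None
    pkg_dir = list(spec.submodule_search_locations)[0]
    capi = os.path.join(pkg_dir, "capi")
    return capi if os.path.isdir(capi) else None

if __name__ == "__main__":
    print(find_capi_dir() or "onnxruntime is not installed")
```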
