
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 #21435

Closed

skinnynpale opened this issue Jul 21, 2024 · 1 comment

Labels: ep:CUDA (issues related to the CUDA execution provider)

@skinnynpale

Describe the issue

*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=22179 ; hostname=9407c20fa6b6 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info_.device_id);

when using ['CUDAExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.


2024-07-21 23:37:56.704 Uncaught app exception
Traceback (most recent call last):
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=22179 ; hostname=9407c20fa6b6 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info_.device_id);

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 589, in _run_script
    exec(code, module.__dict__)
  File "/root/sasha-ai-workflow/src/chat-service/src/v1-streamlit.py", line 6, in <module>
    from EmbeddingCache import EmbeddingCache
  File "/root/sasha-ai-workflow/src/chat-service/src/EmbeddingCache.py", line 40, in <module>
    embedding_model = TextEmbedding(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/fastembed/text/text_embedding.py", line 61, in __init__
    self.model = EMBEDDING_MODEL_TYPE(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/fastembed/text/onnx_embedding.py", line 237, in __init__
    self.load_onnx_model(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/fastembed/common/onnx_model.py", line 80, in load_onnx_model
    self.model = ort.InferenceSession(
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 432, in __init__
    raise fallback_error from e
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 427, in __init__
    self._create_inference_session(self._fallback_providers, None)
  File "/root/sasha-ai-workflow/src/chat-service/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:123 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:116 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 804: forward compatibility was attempted on non supported HW ; GPU=22179 ; hostname=9407c20fa6b6 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=280 ; expr=cudaSetDevice(info_.device_id);
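
Note how ONNX Runtime's automatic fallback also fails: it retries with ['CUDAExecutionProvider', 'CPUExecutionProvider'], which still calls cudaSetDevice and hits the same failure 804. A minimal workaround sketch for the session-creation step, assuming the model path is available as model_path (a placeholder name), is to retry with a CPU-only provider list so the retry never touches CUDA:

import onnxruntime as ort

def make_session(model_path: str) -> ort.InferenceSession:
    # Prefer the CUDA EP; if CUDA initialization fails (e.g. CUDA failure 804),
    # build a CPU-only session instead of a mixed CUDA+CPU fallback list.
    try:
        return ort.InferenceSession(model_path, providers=["CUDAExecutionProvider"])
    except RuntimeError:
        return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])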

To reproduce

!pip install onnxruntime-gpu -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/ -qq
!pip install fastembed-gpu -qqq

from typing import List

import numpy as np

from fastembed import TextEmbedding

embedding_model_gpu = TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5", providers=["CUDAExecutionProvider"]
)
embedding_model_gpu.model.model.get_providers()
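
As a quick sanity check before constructing the model, you can confirm that the installed onnxruntime build exposes the CUDA EP at all. A minimal sketch, assuming onnxruntime-gpu from the index above is the package that actually got installed:

import onnxruntime as ort

print(ort.get_available_providers())  # should include 'CUDAExecutionProvider'
print(ort.get_device())               # 'GPU' for a working GPU build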

Urgency

No response

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 12.4

github-actions bot added the ep:CUDA label (issues related to the CUDA execution provider) on Jul 21, 2024
@tianleiwu (Contributor) commented Jul 22, 2024

For CUDA failure 804 (forward compatibility was attempted on non supported HW), the recommendation is to update your CUDA driver. See https://forums.developer.nvidia.com/t/forward-compatibility-was-attempted-on-non-supported-hw/204254/6
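
CUDA error 804 (cudaErrorCompatNotSupportedOnDevice) is raised when a forward-compatibility libcuda is loaded on a driver/GPU combination that does not support it, which is why a driver update resolves it. A minimal sketch to inspect the installed driver from Python, assuming nvidia-smi is on PATH:

import subprocess

# Print driver version and GPU name; with onnxruntime-gpu built against
# CUDA 12.x, the driver must be new enough for that runtime.
print(subprocess.check_output(
    ["nvidia-smi", "--query-gpu=driver_version,name", "--format=csv,noheader"],
    text=True,
))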

Try installing driver 555 from https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local
