
[Error] [ONNXRuntimeError] : 1 : FAIL : CUDA failure 3: initialization error #21368

Closed
phamkhactu opened this issue Jul 16, 2024 · 4 comments
Labels
ep:CUDA issues related to the CUDA execution provider

Comments

phamkhactu commented Jul 16, 2024

Describe the issue

I tested my model successfully on my local machine with the same torch version. However, I get an error when I load the model in a Docker image.

  File "/opt/conda/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 217, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : CUDA failure 3: initialization error ; GPU=32764 ; hostname=dcd468bff115 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=388 ; expr=cudaSetDevice(GetDeviceId()); 

To reproduce

Docker image: pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel
onnxruntime-gpu: onnxruntime-gpu==1.15

Following the docs provided, I think my setup is correct.
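
As a quick sanity check inside the container, the sketch below confirms that ONNX Runtime sees the GPU at all, before any session is created:

import onnxruntime as ort

print(ort.get_device())               # expected: "GPU" for onnxruntime-gpu
print(ort.get_available_providers())  # should include "CUDAExecutionProvider"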

Urgency

No response

Platform

Linux

OS Version

20.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

Cuda 11.8

github-actions bot added the ep:CUDA label Jul 16, 2024
tianleiwu (Contributor) commented Jul 17, 2024

I could not reproduce the issue. Here is what I tried:

docker run --rm -it --gpus all -v $PWD:/workspace pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel /bin/bash

Then, run the following command line in docker

pip install onnxruntime-gpu==1.15

Finally, test an onnx model using python:

import torch  # If you do not import torch, you will need to install cuDNN 8.* and set the path properly.
import onnxruntime
session = onnxruntime.InferenceSession("matmul_1.onnx", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
session.get_providers()
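
(matmul_1.onnx above is just a small test model; a minimal sketch to generate a comparable one with torch, where the class name and shapes are arbitrary:)

import torch

class MatMul(torch.nn.Module):
    def forward(self, x, y):
        return torch.matmul(x, y)

torch.onnx.export(MatMul(), (torch.randn(2, 3), torch.randn(3, 4)), "matmul_1.onnx")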

Everything seems good.

phamkhactu (Author) commented:

Okay, I will check it again. Thank you.

phamkhactu (Author) commented:

Hi @tianleiwu,

After debugging many times on my local machine and in the Docker container, with the same code in both environments, I saw that:

  • The model initializes successfully on my local machine
  • It fails in the Docker container

Here is what I found:
I have an init.py where I initialize my global model, and app.py is a FastAPI app served with gunicorn. The error occurs every time I call /tts, inside the get_audio function:

import logging
import tempfile
import threading

import numpy as np
import torch
from fastapi import FastAPI, HTTPException
from fastapi.responses import FileResponse
from pydantic import BaseModel
from scipy.io import wavfile

import helpers  # project module providing clean_text
import init     # project module; creates the global model at import time

logger = logging.getLogger(__name__)
app = FastAPI()


class Text2SpeechItem(BaseModel):
    # Fields inferred from how the handler reads the request body.
    text: str
    speaker_id: int


def get_audio(sentences, speaker_id):
    wavs = []
    for sent in sentences:
        if sent.strip() == "":
            continue
        if sent == "########":  # sentinel token marking a longer pause
            silence_duration = int(0.15 * 22050)  # 0.15 s at 22050 Hz
            silence = np.zeros(silence_duration)
            wavs.append(silence)
            continue
        wav = init.model(sent, speaker_id)
        wavs.append(wav)
        silence_duration = int(0.1 * 22050)  # short gap between sentences
        silence = np.zeros(silence_duration)
        wavs.append(silence)

    wav = np.concatenate(tuple(wavs))
    return wav


@app.post("/tts")
async def tts(item: Text2SpeechItem):
    text = item.text
    speaker_id = item.speaker_id
    if text == "":
        raise HTTPException(status_code=400, detail="empty text")
    # torch.cuda.empty_cache()
    sentences = helpers.clean_text(text)
    logger.info("*" * 50)
    logger.info(sentences)
    logger.info("*" * 50)
    wav = get_audio(sentences, speaker_id=speaker_id)

    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        wavfile.write(tmp.name, rate=22050, data=wav.astype(np.int16))
        torch.cuda.empty_cache()
        # Note: the payload is WAV data, so the media type should match.
        return FileResponse(tmp.name, media_type="audio/wav", filename=tmp.name)


# uvicorn.run(app, host="0.0.0.0", port=6688)
t = threading.Thread(target=setting_env)  # setting_env is defined elsewhere in the project
t.start()

What do you think this issue comes from? Do you have any suggestions?
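
One likely reading, given that cudaSetDevice fails with an initialization error and the device id in the log is nonsensical (GPU=32764): a CUDA context created before gunicorn forks its worker processes is not usable in the children, because CUDA contexts do not survive fork(). If init.py builds the ONNX Runtime session at import time and gunicorn loads the app in the master process (for example with --preload), every worker inherits a dead context. A common workaround is to create the model in a startup hook, which runs inside each worker after the fork. A minimal sketch, where load_model and model.onnx are hypothetical stand-ins for the construction currently in init.py:

import onnxruntime
from fastapi import FastAPI

app = FastAPI()
model = None  # populated per worker in the startup hook below


def load_model():
    # Hypothetical stand-in for the model construction currently in init.py.
    return onnxruntime.InferenceSession(
        "model.onnx",  # placeholder path
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )


@app.on_event("startup")
def load_model_in_worker():
    # Runs once per worker process, after gunicorn has forked, so the CUDA
    # context is created in the process that actually runs inference.
    global model
    model = load_model()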

phamkhactu (Author) commented:

I spent more time and tried other serving setups. After that, I found that uvicorn works well.
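
That is consistent with the fork explanation above: uvicorn's default mode serves the app from a single process, so no CUDA context ever crosses a fork(). For reference, a sketch of the two launch commands, with the module name app and port 6688 taken from the commented-out uvicorn.run line; if gunicorn is kept as the process manager, avoiding --preload lets each worker import the app, and thus create its CUDA context, after the fork:

uvicorn app:app --host 0.0.0.0 --port 6688
gunicorn app:app -k uvicorn.workers.UvicornWorker --workers 1 --bind 0.0.0.0:6688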
