GPU Usage 100% #17942
Labels
ep:CUDA
issues related to the CUDA execution provider
ep:TensorRT
issues related to TensorRT execution provider
stale
issues that have not been addressed in a while; categorized by a bot
Describe the issue
Issue Description
I wrote a Go wrapper around the onnxruntime GPU library for our model inference. The onnxruntime shared library (.so) is loaded only once, when the service starts. A dedicated OrtSession object is created for each new model version and destroyed when a newer version arrives. The older OrtSession object is destroyed safely under the guard of a read-write lock.
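The session lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the actual wrapper: `Session`, `Holder`, `Swap`, and `fakeSession` are hypothetical names, and the real code would call into onnxruntime through cgo where the stand-in does simple arithmetic. The issue does not say whether the old session is destroyed while the write lock is held or after it is released; this sketch assumes the latter.

```go
package main

import (
	"errors"
	"sync"
)

// Session is a hypothetical stand-in for the cgo wrapper around an OrtSession.
type Session interface {
	Run(input []float32) ([]float32, error)
	Close() // in the real wrapper this would release the OrtSession
}

// Holder owns the session for the current model version.
type Holder struct {
	mu      sync.RWMutex
	current Session
}

// Run holds the read lock for the whole inference call, so Swap cannot
// replace (and then destroy) the session while it is in use.
func (h *Holder) Run(input []float32) ([]float32, error) {
	h.mu.RLock()
	defer h.mu.RUnlock()
	if h.current == nil {
		return nil, errors.New("no model loaded")
	}
	return h.current.Run(input)
}

// Swap installs the session for a newer model version and destroys the old
// one. Acquiring the write lock waits for all in-flight Run calls, so once
// the lock is released no caller can still be using the old session.
func (h *Holder) Swap(next Session) {
	h.mu.Lock()
	old := h.current
	h.current = next
	h.mu.Unlock()
	if old != nil {
		old.Close()
	}
}

// fakeSession exists only to exercise the sketch without a GPU.
type fakeSession struct {
	version float32
	closed  bool
}

// Run multiplies each input by the version number (placeholder for inference).
func (s *fakeSession) Run(input []float32) ([]float32, error) {
	out := make([]float32, len(input))
	for i, v := range input {
		out[i] = v * s.version
	}
	return out, nil
}

// Close records that the session was destroyed.
func (s *fakeSession) Close() { s.closed = true }
```

With this layout, each model update calls `Swap` with the newly created session, and every inference request goes through `Holder.Run`.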
After the service has run for several hours with continuous model version updates, we see the following situation.
Expected Behavior
The service runs normally and the situation described above does not occur.
Versions
onnxruntime gpu == 1.15.1, downloaded from the GitHub release page
cuda == 11.2
gpu device = A30
Files
image 0, gpu usage
image 1, cpu usage
thread stack
`Thread 15 (Thread 0x7f175d7fa700 (LWP 69607)):
#0 0x00007f15bb484bec in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007f15bb6abd62 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007f15bb6ac879 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007f15bb7e7450 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007f15bb439ce3 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007f15bb43a1d1 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007f15bb43b138 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00007f15bb60d251 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#8 0x00007f17140f04e9 in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudart.so.11.0
#9 0x00007f17140ca9ed in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudart.so.11.0
#10 0x00007f171410ee96 in cudaMemcpyAsync () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudart.so.11.0
#11 0x00007f158b6a381d in onnxruntime::GPUDataTransfer::CopyTensorAsync(onnxruntime::Tensor const&, onnxruntime::Tensor&, onnxruntime::Stream&) const () from /root/go/src/lib/libonnxruntime_providers_cuda.so
#12 0x00007f16937fa5a3 in ?? () from /root/go/src/lib//libonnxruntime.so
#13 0x00007f1693091f48 in ?? () from /root/go/src/lib//libonnxruntime.so
#14 0x00007f158b891497 in onnxruntime::IDataTransfer::CopyTensors(std::vector<onnxruntime::IDataTransfer::SrcDstPair, std::allocator<onnxruntime::IDataTransfer::SrcDstPair> > const&) const () from /root/go/src/lib/libonnxruntime_providers_cuda.so
#15 0x00007f16937fb58c in ?? () from /root/go/src/lib//libonnxruntime.so
#16 0x00007f169389ddcb in ?? () from /root/go/src/lib//libonnxruntime.so
#17 0x00007f169389fa33 in ?? () from /root/go/src/lib//libonnxruntime.so
#18 0x00007f16938a01fc in ?? () from /root/go/src/lib//libonnxruntime.so
#19 0x00007f16930dff6c in ?? () from /root/go/src/lib//libonnxruntime.so
#20 0x00007f169306ef67 in ?? () from /root/go/src/lib//libonnxruntime.so
#21 0x0000000001673d56 in _cgo_de0f6483b9ae_Cfunc_RunOrtSession (v=0xc00cceb098) at cgo-gcc-prolog:431
#22 0x000000000047c624 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:821
#23 0x000000c031b244e0 in ?? ()
#24 0x0000000000000004 in ?? ()
#25 0x000000c00cceaff8 in ?? ()
#26 0x000000000047eb86 in time.now () at /usr/local/go/src/runtime/time_linux_amd64.s:52
#27 0x000000000a256179 in ?? ()
#28 0x00007f177f5b2f6f in ?? ()
#29 0x0000000000800000 in ?? () at :1
#30 0x0000000000000000 in ?? ()
Thread 68 (Thread 0x7f15517fe700 (LWP 312851)):
#0 0x00007f15bb70c823 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007f15bb469206 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007f15bb7ef2bf in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007f15bb7eff6f in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007f15bb484bf7 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007f15bb518928 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007f15bb5b9063 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00007f15d75c3717 in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcublas.so.11
#8 0x00007f15d75f3f15 in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcublas.so.11
#9 0x00007f15d6c87bfc in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcublas.so.11
#10 0x00007f15d6c89010 in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcublas.so.11
#11 0x00007f15d6c87539 in ?? () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcublas.so.11
#12 0x00007f15d6d4defa in cublasDestroy_v2 () from /usr/local/cuda-11.2/targets/x86_64-linux/lib/libcublas.so.11
#13 0x00007f17381d6159 in onnxruntime::CudaStream::~CudaStream() () from /root/go/src/lib/libonnxruntime_providers_tensorrt.so
#14 0x00007f17381d624d in onnxruntime::CudaStream::~CudaStream() () from /root/go/src/lib/libonnxruntime_providers_tensorrt.so
#15 0x00007f169380d60a in ?? () from /root/go/src/lib//libonnxruntime.so
#16 0x00007f169380d811 in ?? () from /root/go/src/lib//libonnxruntime.so
#17 0x00007f16930eb959 in ?? () from /root/go/src/lib//libonnxruntime.so
#18 0x00007f16930ee804 in ?? () from /root/go/src/lib//libonnxruntime.so
#19 0x00007f16930eed7d in ?? () from /root/go/src/lib//libonnxruntime.so
#20 0x000000000047c624 in runtime.asmcgocall () at /usr/local/go/src/runtime/asm_amd64.s:821
#21 0x0000000000452aed in runtime.park_m (gp=0xc0001dc340) at /usr/local/go/src/runtime/proc.go:3336
#22 0x000000c0036aa4e0 in ?? ()
#23 0x000000c0001dc340 in ?? ()
#24 0x0000000000000000 in ?? ()
`
Questions
To reproduce
No
Urgency
No
Platform
Linux
OS Version
ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.15.1
ONNX Runtime API
C
Architecture
X64
Execution Provider
TensorRT
Execution Provider Library Version
TensorRT 8.6.1, CUDA 11.2
Model File
No response
Is this a quantized model?
No