OOM v1.7.0 with any example (demo, example with multi GPU...) #1864

Open
moncio opened this issue Feb 2, 2021 · 18 comments
Labels
duplicate This issue or pull request already exists

Comments

@moncio

moncio commented Feb 2, 2021

Issue Summary

The precompiled OpenPoseDemo v1.7.0 with hands and face disabled will crash on my 11GB GPU due to lack of memory.

Executed Command (if any)

./build/examples/openpose/openpose.bin --model_folder models/ --video examples/media/video.avi

The same happens when I try to run a batch of images with the script: https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/examples/tutorial_api_python/05_keypoints_from_images_multi_gpu.py

OpenPose Output (if any)

Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting camera index... Detected and opened camera 0.
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.
F0201 02:28:29.051620 17604 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***

Errors (if any)
Check failed: error == cudaSuccess (2 vs. 0) out of memory
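
For anyone scanning logs for this failure, here is a minimal sketch of how the fatal line could be detected automatically. It assumes the log keeps the exact "Check failed: error == cudaSuccess (N vs. 0)" wording shown above; the helper names are hypothetical and not part of OpenPose or Caffe. CUDA runtime error code 2 is cudaErrorMemoryAllocation, which is why the message reads "out of memory".

```python
import re

# Matches the glog check-failure text seen in the log above, e.g.
# "Check failed: error == cudaSuccess (2 vs. 0) out of memory".
CUDA_CHECK = re.compile(
    r"Check failed: error == cudaSuccess \((\d+) vs\. 0\)\s*(.*)"
)


def parse_cuda_check_failure(line):
    """Return (error_code, message) for a CUDA check failure, else None."""
    match = CUDA_CHECK.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


def is_out_of_memory(line):
    """True when the line reports CUDA error 2 (cudaErrorMemoryAllocation)."""
    parsed = parse_cuda_check_failure(line)
    return parsed is not None and parsed[0] == 2


# The fatal line copied from the OpenPose output in this report.
log_line = (
    "F0201 02:28:29.051620 17604 syncedmem.cpp:71] "
    "Check failed: error == cudaSuccess (2 vs. 0) out of memory"
)
print(parse_cuda_check_failure(log_line))  # (2, 'out of memory')
print(is_out_of_memory(log_line))          # True
```

Lines that do not contain a CUDA check failure simply return None, so the helper can be run over a whole log file line by line.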

Type of Issue
Execution error

Your System Configuration:

Operating system: Ubuntu 20.04
Graphics card: Nvidia RTX 2080 11GB, driver version 450.10
CUDA 11.0.3, cuDNN 8.0.5
Caffe, OpenCV: Default from OpenPose
OpenPose version 1.7.0 (GPU release), from:

https://github.com/CMU-Perceptual-Computing-Lab/openpose/releases/tag/v1.7.0

@Witek-

Witek- commented Feb 2, 2021

I think it's the same as reported here: #1861 although I am able to run it without face and hands.
Try with version 1.6.0. Version 1.7.0 is bugged.

@moncio
Author

moncio commented Feb 2, 2021

Is version 1.6.0 proven to work with this configuration? @Witek- @gineshidalgo99

@Witek-

Witek- commented Feb 2, 2021

I have a 6GB GPU and I can run the precompiled OpenPose demo with hands and face

@moncio
Author

moncio commented Feb 2, 2021

I always hit these issues when compiling the library (v1.6.0), particularly Caffe:

image

image

@Witek- @gineshidalgo99

While investigating, I came across this: BVLC/caffe#6970
That thread points to this repo: https://github.com/Qengineering/caffe
I tried replacing the Caffe repo with that one. Everything looked OK at first, but the problems appeared again, which brought me back to this: #1729

And just when you don't expect more problems... yes...

image

Complete hell, sorry for the problems and thank you for the help.
Hope we can do it

@moncio
Author

moncio commented Feb 3, 2021

Another tip: I just tried compiling without USE_CUDNN. It works, consuming 5 GB without face and hands.

But when I run the script: https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/examples/tutorial_api_python/05_keypoints_from_images_multi_gpu.py

Error: Segmentation fault (core dumped) (raised at the call to opWrapper.waitAndPop(datums))

@gineshidalgo99
Member

gineshidalgo99 commented Feb 7, 2021

Windows issue - cuDNN not being used

This is my Caffe repo to compile in Windows: https://github.com/gineshidalgo99/caffeCompilerForWindowsAndCUDA It is based on the Windows Caffe one.

I have not been able to compile cuDNN for Windows, it keeps giving me this error:

[Many other logs]
'OpenPoseDemo.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudnn_ops_infer64_8.dll'. Module was built without symbols.
'OpenPoseDemo.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudnn_cnn_infer64_8.dll'. Module was built without symbols.
F0207 11:36:55.959534  5612 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0)  CUDNN_STATUS_NOT_INITIALIZED
Unhandled exception at 0x00007FFFA5DD286E (ucrtbase.dll) in OpenPoseDemo.exe: Fatal program exit requested.

The program '[7240] OpenPoseDemo.exe' has exited with code 0 (0x0).

If anybody is able to get it to work without the CUDNN_STATUS_NOT_INITIALIZED error, I'd highly appreciate some hints about the exact CUDA/cuDNN version and/or instructions to get it to work! :)

PS: It must use CUDA >= 11 to be compatible with Nvidia 30XX cards

Please continue this Ubuntu discussion in #1845, to centralize messages and hopefully focus efforts to fix the issue. Thanks!

Ubuntu issue - cuDNN using too much memory

For Ubuntu users with memory issues, v1.7.0 was modified to allow cuDNN 8, which was a pain. I am not an expert on it, so I am sure there must be a better way to run the cuDNN convolutions using less memory. I am very open to suggestions about the cudnn_conv implementation to minimize memory:
https://github.com/CMU-Perceptual-Computing-Lab/caffe/blob/master/src/caffe/layers/cudnn_conv_layer.cpp

@botransfer

I actually found that cuDNN not only consumes more GPU memory but also, in most cases, degrades performance (see attached). Perhaps it would be better, for the time being, to change the default CMake setting so that cuDNN is not used.

openpose.xlsx

By the way, with the newer version of openpose, I found that RTX 3090 can process images much faster than Titan RTX.
Thank you for your efforts!

@isakengstrom


Hi, I've had some memory problems running the latest version of OpenPose from source on Ubuntu with cuDNN as well, and found that it requires less memory running without cuDNN. If this is a known issue, maybe it could be helpful for others to add a note about this in the prerequisites or similar? @gineshidalgo99

@stale

stale bot commented Jun 11, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale/old label Jun 11, 2021
@zhuwei-jim

On my PC with Ubuntu 20 and CUDA 11.4, OpenPose 1.7 also uses 6 GB of GPU memory. With Ubuntu 16 and CUDA 10.1, it uses 3 GB. My GPU is a GTX 1080.
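
To make memory figures like these easier to compare across setups, here is a rough sketch that reads per-GPU memory from nvidia-smi while OpenPose is running. It assumes nvidia-smi is on the PATH; the function names are hypothetical, and the sample row at the bottom is made up for illustration rather than measured.

```python
import subprocess

# nvidia-smi query that prints one CSV row per GPU, values in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse nvidia-smi CSV rows into {gpu_index: (used_mib, total_mib)}."""
    usage = {}
    for row in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in row.split(","))
        usage[int(index)] = (int(used), int(total))
    return usage


def read_gpu_memory():
    """Run nvidia-smi and return the parsed usage for every GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# Parsing an illustrative row (not a real measurement).
sample = "0, 6012, 8119\n"
print(parse_gpu_memory(sample))  # {0: (6012, 8119)}
```

Calling read_gpu_memory() in a second terminal once the demo has started gives a number that can be compared between a cuDNN build and a non-cuDNN build.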

@stale stale bot removed the stale/old label Aug 15, 2021
@orestis-z


@gineshidalgo99 I'm thinking of trying to reduce the memory consumption. Any suspicions about where the memory inefficiencies come from?

@italosalgado14

@orestis-z try moncio's suggestion: don't enable cuDNN in the cmake-gui flags. The memory usage is 3 GB (with cuDNN it is 7.6 GB!) on my Nvidia 3070 with CUDA 11.4 and cuDNN 8.2.4. Command: build/examples/openpose/openpose.bin --video examples/media/video.avi --net_resolution "-512x256" --logging_level 0 --disable_multi_thread --disable_blending

@orestis-z

orestis-z commented Oct 26, 2021

@italosalgado14, interesting, what OS / OS version are you running on?

My suggestion was to fix the cuDNN issue, not to leave it out.

@orestis-z

@gineshidalgo99, please find here a working Caffe patch (for inference only):
https://github.com/orestis-z/caffe

Diff here:
CMU-Perceptual-Computing-Lab/caffe@master...orestis-z:master

Caffe compilation similar to:

# Optionally add -DNN_PREFER_FASTEST_ALGORITHMS=OFF to the flags below if you want to trade memory for speed
mkdir build && cd build && cmake .. \
  -DUSE_CUDNN=ON \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_docs=OFF \
  -DBUILD_python=OFF \
  -DBUILD_python_layer=OFF \
  -DUSE_LEVELDB=OFF \
  -DUSE_LMDB=OFF \
  -DUSE_OPENCV=OFF

Then compile OpenPose against your custom Caffe build, similar to this:

cmake .. \
  -DBUILD_CAFFE=OFF \
  -DCaffe_INCLUDE_DIRS=/path/to/caffe/include \
  -DCaffe_LIBS=/usr/lib/libcaffe.so

@HospitableHost

HospitableHost commented Oct 6, 2022

Hey! Try cuDNN 7.5.1 while keeping CUDA 11.0.3, i.e. don't change your CUDA version.
I had the same problem in my environment (Ubuntu 18.04, CUDA 11.1, cuDNN 8.0.5). I switched to cuDNN 7.5.1 and modified the CMake files (openpose/cmake/Modules/FindCuDNN.cmake and openpose/3rdparty/caffe/cmake/Cuda.cmake) to locate the cuDNN 7.5.1 include and lib. After that, the "error == cudaSuccess (2 vs. 0) out of memory" problem was solved.

cuDNN 7.5.1 download URL: https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.5.1/prod/10.1_20190418/cudnn-10.1-linux-x64-v7.5.1.10.tgz

After fixing this problem, I ran openpose.bin to test, but unfortunately it did not use the GPU (GPU-Util was 0%). It actually ran on the CPU, and processing was really slow.

(Two days later) Good news: I successfully ran openpose.bin on the GPU. Try cuDNN v8.4.1 (May 27th, 2022), for CUDA 11.x:
https://developer.nvidia.com/compute/cudnn/secure/8.4.1/local_installers/11.6/cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz
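
Since the fix here depends on which cuDNN version the build actually picks up, a small sketch for reading the version out of the cuDNN header may help. It assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (in cuDNN 8 these live in cudnn_version.h; older releases define them in cudnn.h). The function name is hypothetical, and the sample text below is an illustrative excerpt rather than a full header.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the cuDNN version macros."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            rf"^\s*#\s*define\s+{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            raise ValueError(f"{name} not found in header")
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative excerpt matching the cuDNN 8.0.5 setup reported in this thread.
sample_header = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 5
"""
print(parse_cudnn_version(sample_header))  # (8, 0, 5)
```

To check a real install, read the header that FindCuDNN.cmake resolves to and pass its contents to the function; the header location varies by system, so the path is left to the reader.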

@HospitableHost

HospitableHost commented Oct 6, 2022

Hey! It seems that cuDNN 7.5.1 can work despite using CUDA 11.
I had the same problem in my environment (Ubuntu 18.04, CUDA 11.1, cuDNN 8.0.5). I switched to cuDNN 7.5.1 and modified the CMake files (openpose/cmake/Modules/FindCuDNN.cmake and openpose/3rdparty/caffe/cmake/Cuda.cmake) to locate the cuDNN 7.5.1 include and lib. After that, both the "error == cudaSuccess (2 vs. 0) out of memory" and "CUDNN_STATUS_NOT_INITIALIZED" problems were solved.
But when I tested with the command "./build/examples/openpose/openpose.bin --video examples/media/video.avi --write_video output/result.avi --display 0", openpose.bin unfortunately did not use the GPU (GPU-Util was 0%). It actually ran on the CPU, and processing was really slow.

(Two days later) Good news: I successfully ran openpose.bin on the GPU. Try cuDNN v8.4.1 (May 27th, 2022), for CUDA 11.x:
https://developer.nvidia.com/compute/cudnn/secure/8.4.1/local_installers/11.6/cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz

@ohadOrbach

Hi @HospitableHost, can you explain how you solved the problem with the new version of cuDNN? Did you just install the new one and it worked, or did you need to modify CMake or something else?
And if you modified your CMake files, can you explain how?

thx

@davidpagnon

I personally just ran !apt install --allow-change-held-packages libcudnn8=8.1.0.77-1+cuda11.2 to get compatible CUDA and cuDNN versions, and it magically worked (see my Sports2D Colab notebook).
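
As a side note on that pin: the package version string carries both the cuDNN version and the CUDA version it was built for. Here is a minimal sketch for splitting it, assuming the "X.Y.Z.W-N+cudaA.B" layout seen in the command above; the function name is hypothetical.

```python
import re


def split_cudnn_package_version(version):
    """Split e.g. '8.1.0.77-1+cuda11.2' into (cudnn_version, cuda_version)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-\d+\+cuda(\d+\.\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return match.group(1), match.group(2)


# The version pinned in the apt command above.
print(split_cudnn_package_version("8.1.0.77-1+cuda11.2"))  # ('8.1.0.77', '11.2')
```

The second element is the CUDA version the package targets, which is the value to compare against the CUDA toolkit installed on the machine.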
