[Build] Fail building OnnxRuntime with Cuda 11.2 #17961

Closed
IzanCatalan opened this issue Oct 16, 2023 · 3 comments
Labels
build build issues; typically submitted using template ep:CUDA issues related to the CUDA execution provider

Comments

@IzanCatalan

Describe the issue

Hi everyone, I'm trying to build onnxruntime for On-Device Training, following the tutorial at https://onnxruntime.ai/docs/build/training.html . I cloned the main branch of the GitHub repo yesterday, so both the source and the build script are the ones currently in the repo.

However, my environment differs from the one in the tutorial. I'm building onnxruntime on Ubuntu 20.04 with CUDA 11.2 and cuDNN 8. The CMake version is 3.27 and the Python version is 3.8, in a conda environment. The build fails.

The shell command I use is: ./build.sh --config RelWithDebInfo --build_shared_lib --parallel --enable_training --allow_running_as_root --build_wheel --use_cuda --cuda_home /usr/local/cuda-11.2/ --cudnn_home /usr/local/cuda-11.2/ --cuda_version=11.2 --skip_tests

I also export the following environment variables:
export CUDA_HOME=/usr/local/cuda-11.2/
export CUDNN_HOME=/usr/local/cuda-11.2/
export CUDACXX=/usr/local/cuda-11.2/bin/nvcc

And the error is the following:

In file included from /mnt/beegfs/gap/izcagal/ort/onnxruntime/onnxruntime/test/testdata/custom_op_library/cuda/cuda_ops.cc:10:
/mnt/beegfs/gap/izcagal/ort/onnxruntime/include/onnxruntime/core/providers/cuda/cuda_context.h:12:10: fatal error: cudnn.h: No such file or directory
   12 | #include <cudnn.h>
      |          ^~~~~~~~~
compilation terminated.
CMakeFiles/custom_op_library.dir/build.make:104: recipe for target 'CMakeFiles/custom_op_library.dir/mnt/beegfs/gap/izcagal/ort/onnxruntime/onnxruntime/test/testdata/custom_op_library/cuda/cuda_ops.cc.o' failed
gmake[2]: *** [CMakeFiles/custom_op_library.dir/mnt/beegfs/gap/izcagal/ort/onnxruntime/onnxruntime/test/testdata/custom_op_library/cuda/cuda_ops.cc.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
In file included from /mnt/beegfs/gap/izcagal/ort/onnxruntime/build/Linux/RelWithDebInfo/_deps/protobuf-src/src/google/protobuf/io/gzip_stream.cc:38:
/mnt/beegfs/gap/izcagal/ort/onnxruntime/build/Linux/RelWithDebInfo/_deps/protobuf-src/src/google/protobuf/io/gzip_stream.h:50:10: fatal error: zlib.h: No such file or directory
   50 | #include "zlib.h"
      |          ^~~~~~~~
compilation terminated.
_deps/protobuf-build/CMakeFiles/libprotobuf.dir/build.make:747: recipe for target '_deps/protobuf-build/CMakeFiles/libprotobuf.dir/src/google/protobuf/io/gzip_stream.cc.o' failed
gmake[2]: *** [_deps/protobuf-build/CMakeFiles/libprotobuf.dir/src/google/protobuf/io/gzip_stream.cc.o] Error 1
CMakeFiles/Makefile2:7368: recipe for target '_deps/protobuf-build/CMakeFiles/libprotobuf.dir/all' failed
gmake[1]: *** [_deps/protobuf-build/CMakeFiles/libprotobuf.dir/all] Error 2
gmake[1]: *** Waiting for unfinished jobs....

/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/barrier.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/barrier.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/blocking_counter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/blocking_counter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/create_thread_identity.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/create_thread_identity.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/futex_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/futex_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/per_thread_sem.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/per_thread_sem.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/pthread_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/pthread_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/sem_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/sem_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/stdcpp_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/stdcpp_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/waiter_base.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/waiter_base.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/win32_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/internal/win32_waiter.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/notification.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/notification.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/mutex.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ar: warning: CMakeFiles/absl_synchronization.dir/mutex.cc.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(barrier.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(barrier.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(blocking_counter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(blocking_counter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(create_thread_identity.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(create_thread_identity.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(futex_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(futex_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(per_thread_sem.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(per_thread_sem.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(pthread_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(pthread_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(sem_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(sem_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(stdcpp_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(stdcpp_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(waiter_base.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(waiter_base.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(win32_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(win32_waiter.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(notification.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(notification.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001
/usr/bin/ranlib: warning: libabsl_synchronization.a(mutex.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010002
/usr/bin/ranlib: warning: libabsl_synchronization.a(mutex.cc.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0010001

I would like to know what is causing the error.

Thank you.

Urgency

No response

Target platform

Ubuntu 20.04

Build script

#!/bin/bash

Error / output

Visual Studio Version

No response

GCC / Compiler Version

9.5.0

@IzanCatalan IzanCatalan added the build build issues; typically submitted using template label Oct 16, 2023
@github-actions github-actions bot added the ep:CUDA issues related to the CUDA execution provider label Oct 16, 2023
@snnn
Member

snnn commented Oct 16, 2023

  1. You also need to install cudnn
  2. The minimum CUDA version we support is CUDA 11.4.
  3. I suggest not using conda when building onnxruntime from source. It's fine to run the built package with conda. But when building from source, a conda environment may have many extra dependencies that could conflict with something. Also, conda has onnxruntime packages. You can read their recipe to see how onnxruntime is built in conda.
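
Point 1 and the two `No such file or directory` errors in the log can be checked before starting a build. The following is a minimal sketch, assuming headers live under `<home>/include` (a common layout, but not guaranteed on every install). `find_missing_headers` is a hypothetical helper written for this illustration and is not part of the onnxruntime build scripts.

```python
import os


def find_missing_headers(home, headers):
    """Return the header names that are NOT present under <home>/include.

    Assumption: headers are installed directly in <home>/include.
    """
    include_dir = os.path.join(home, "include")
    return [h for h in headers
            if not os.path.isfile(os.path.join(include_dir, h))]


if __name__ == "__main__":
    # Same location that was passed to --cudnn_home in the build command.
    cudnn_home = os.environ.get("CUDNN_HOME", "/usr/local/cuda-11.2/")
    # Assumption: system headers such as zlib.h are under /usr/include.
    checks = [(cudnn_home, ["cudnn.h"]), ("/usr", ["zlib.h"])]
    for home, headers in checks:
        missing = find_missing_headers(home, headers)
        status = "missing: " + ", ".join(missing) if missing else "ok"
        print(f"{home}: {status}")
```

If `cudnn.h` is reported missing, the directory given as `--cudnn_home` does not contain the cuDNN headers, which matches the first compiler error in the log.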

@snnn snnn closed this as not planned Oct 16, 2023
@snnn
Member

snnn commented Oct 16, 2023

I will close it as we do not support CUDA 11.2 anymore.
You can build onnxruntime in a clean Ubuntu 20.04 environment with just GCC/G++ and CUDA/CUDNN libraries.
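
The installed toolkit can be compared against the 11.4 minimum mentioned above before starting a long build. This is a rough sketch that assumes `nvcc --version` prints a line containing `release <major>.<minor>`; the helper names are hypothetical and the check is not something the build script itself is known to perform.

```python
import re
import shutil
import subprocess

# Minimum CUDA version stated in this thread.
MIN_CUDA = (11, 4)


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=MIN_CUDA):
    """Tuples compare element by element, so (11, 2) < (11, 4)."""
    return version >= minimum


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH")
    else:
        out = subprocess.run([nvcc, "--version"], capture_output=True,
                             text=True).stdout
        version = parse_nvcc_release(out)
        if version is None:
            print("could not parse nvcc output")
        else:
            verdict = "ok" if meets_minimum(version) else "below minimum"
            print(f"CUDA {version[0]}.{version[1]}: {verdict}")
```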

@IzanCatalan
Author

@snnn I managed to build the latest version of onnxruntime from GitHub with CUDA 11.2 inside a Docker container. It is the same version that failed in my original post. With Docker, it worked. I think this is because, as you said, a conda environment may have extra dependencies.

Next, I took the Python wheel generated inside the Docker container and installed it in my conda environment. The installation works. However, when importing onnxruntime I get this error:

Traceback (most recent call last):
  File "docker/train.py", line 2, in <module>
    from onnxruntime.training import artifacts
  File "/mnt/beegfs/gap/izcagal/.conda/envs/onnx/lib/python3.8/site-packages/onnxruntime/__init__.py", line 56, in <module>
    raise import_capi_exception
  File "/mnt/beegfs/gap/izcagal/.conda/envs/onnx/lib/python3.8/site-packages/onnxruntime/__init__.py", line 23, in <module>
    from onnxruntime.capi._pybind_state import ExecutionMode  # noqa: F401
  File "/mnt/beegfs/gap/izcagal/.conda/envs/onnx/lib/python3.8/site-packages/onnxruntime/capi/_pybind_state.py", line 32, in <module>
    from .onnxruntime_pybind11_state import *  # noqa
ImportError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /mnt/beegfs/gap/izcagal/.conda/envs/onnx/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.so)

I have glibc 2.27, but I wonder whether this error is caused directly by my glibc being too old or is related to other dependencies, because it did not appear when I used older onnxruntime versions such as 1.4 or 1.12.
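
The `ImportError` says that `onnxruntime_pybind11_state.so` references symbols versioned `GLIBC_2.29`, while the host provides 2.27. A plausible explanation, stated here as an assumption, is that the wheel was built in a container with a newer glibc than the host, whereas prebuilt release wheels usually target an older glibc baseline. The sketch below scans a shared object for `GLIBC_x.y` tags and compares the highest one with the running system. The function names are hypothetical, and scanning raw bytes is only an approximation of what `objdump -T` reports.

```python
import platform
import re

# Matches version tags such as GLIBC_2.29 embedded in the binary.
_GLIBC_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)")


def max_glibc_required(data):
    """Return the highest (major, minor) GLIBC tag found in the bytes, or None."""
    versions = {(int(a), int(b)) for a, b in _GLIBC_TAG.findall(data)}
    return max(versions) if versions else None


def system_glibc():
    """Return the running interpreter's glibc version as (major, minor), or None."""
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_loadable(so_path):
    """True if the host glibc is at least as new as the library requires."""
    with open(so_path, "rb") as f:
        required = max_glibc_required(f.read())
    have = system_glibc()
    if required is None or have is None:
        return None  # cannot decide (non-glibc system or no tags found)
    return have >= required
```

Calling `is_loadable` on the `onnxruntime_pybind11_state.so` path from the traceback would be expected to return `False` on a glibc 2.27 host if the library really requires 2.29.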
