
Add CUDA12 support for Java's onnxruntime_gpu dependency #19960

Closed
davidecaroselli opened this issue Mar 17, 2024 · 13 comments
Labels
api:Java issues related to the Java API ep:CUDA issues related to the CUDA execution provider release:1.18.0


@davidecaroselli

Describe the issue

When trying to use Java's onnxruntime_gpu:1.17.1 runtime on a CUDA 12 system, the program fails to load the libonnxruntime_providers_cuda.so library because it searches for CUDA 11.x dependencies.

However, as far as I know this issue has already been solved for (nearly) all runtimes except Java: see Install ONNX Runtime.

Can this be ported to the Maven Central build too, please?

To reproduce

On a system with CUDA 12.3 installed:

$ nvidia-smi

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
...

And a Java Maven project using the latest available version of onnxruntime_gpu:

<dependency>
    <groupId>com.microsoft.onnxruntime</groupId>
    <artifactId>onnxruntime_gpu</artifactId>
    <version>1.17.1</version>
</dependency>

You can reproduce the problem simply by running this Java main:

package org.example;

import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

public class App {

    public static void main(String[] args) throws OrtException {
        new OrtSession.SessionOptions().addCUDA(0);
    }

}

Resulting in the following error:

Exception in thread "main" ai.onnxruntime.OrtException: Error code - ORT_RUNTIME_EXCEPTION - message: /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1209 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory

	at ai.onnxruntime.OrtSession$SessionOptions.addCUDA(Native Method)
	at ai.onnxruntime.OrtSession$SessionOptions.addCUDA(OrtSession.java:1009)
	at org.example.App.main(App.java:9)
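The loader message names the exact soname it could not resolve. As a hedged sketch of how one might diagnose this (the error text is inlined from the report above so the snippet is self-contained; on a real machine you would follow up with `ldconfig -p`):

```shell
# Sketch: extract the missing soname from the ORT loader error above.
# The error string is copied from the report; commands are illustrative.
err='Failed to load library libonnxruntime_providers_cuda.so with error: libcublasLt.so.11: cannot open shared object file: No such file or directory'

# Pull out "libcublasLt.so.11" -- the CUDA 11 cuBLAS library the build expects.
missing=$(printf '%s\n' "$err" | sed -n 's/.*error: \([^:]*\): cannot open.*/\1/p')
echo "missing: $missing"

# Drop the version suffix; on a real system, check what IS installed with:
#   ldconfig -p | grep "$base"
base=${missing%%.so*}
echo "base: $base"
```

On a CUDA 12 host that `ldconfig` check would typically show only libcublasLt.so.12, which is why the CUDA 11 build of the provider cannot load.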

Urgency

Development of an internal library is currently blocked: this issue makes it impossible to run any Java ONNX project on our new deployment with the newest NVIDIA GPUs (e.g. GH200), as they require the latest drivers and CUDA libraries.

Platform

Linux

OS Version

Ubuntu 20.04.6 LTS

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.1

ONNX Runtime API

Java

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 12.3

@github-actions github-actions bot added api:Java issues related to the Java API ep:CUDA issues related to the CUDA execution provider labels Mar 17, 2024
@Craigacp
Contributor

You can compile it from source with CUDA 12 support.

@davidecaroselli
Author

Hi @Craigacp and thanks for the advice.

I was able to compile the library from source using the attached Dockerfile; however, there is an important caveat: it seems that ONNX Runtime only supports cuDNN v8, while the latest NVIDIA CUDA images all ship cuDNN v9.

If I try to compile FROM nvidia/cuda:12.3.2-cudnn9-devel-ubuntu22.04, I get multiple errors like:

error: ‘cudnnSetRNNDescriptor_v6’ was not declared in this scope; did you mean ‘cudnnSetRNNDescriptor_v8’?
error: ‘cudnnSetRNNMatrixMathType’ was not declared in this scope; did you mean ‘cudnnSetConvolutionMathType’?
[...]
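Since these errors come from a cuDNN header mismatch, it can help to confirm which cuDNN major version a base image actually ships before building. A hedged sketch that parses the version macros as one might from /usr/include/cudnn_version.h (a sample header fragment is inlined here so the snippet is self-contained; the real path varies by install):

```shell
# Sketch: read the cuDNN major version from cudnn_version.h before building.
# A sample fragment stands in for the real header.
hdr='#define CUDNN_MAJOR 9
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 0'

major=$(printf '%s\n' "$hdr" | awk '$2 == "CUDNN_MAJOR" {print $3}')
echo "cuDNN major: $major"

# ORT 1.17 builds against the cuDNN 8 API, so v9 headers trigger the
# cudnnSetRNNDescriptor_v6 errors shown above.
if [ "$major" -ge 9 ]; then echo "use a cudnn8 base image for ORT 1.17"; fi
```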

This is the Dockerfile I used:

FROM nvidia/cuda:12.1.0-cudnn8-devel-ubuntu22.04

RUN apt-get update && apt-get install -y --no-install-recommends python3-dev ca-certificates g++ python3-numpy gcc make git python3-setuptools python3-wheel python3-packaging python3-pip aria2 unzip wget openjdk-17-jdk && \
    aria2c -q -d /tmp -o cmake-3.27.3-linux-x86_64.tar.gz https://github.com/Kitware/CMake/releases/download/v3.27.3/cmake-3.27.3-linux-x86_64.tar.gz && \
    tar -zxf /tmp/cmake-3.27.3-linux-x86_64.tar.gz --strip=1 -C /usr && rm /tmp/cmake-3.27.3-linux-x86_64.tar.gz && \
    wget -c https://services.gradle.org/distributions/gradle-8.6-bin.zip -P /tmp && unzip /tmp/gradle-8.6-bin.zip -d /opt/ && rm /tmp/gradle-8.6-bin.zip

ENV GRADLE_HOME=/opt/gradle-8.6
ENV PATH=${GRADLE_HOME}/bin:${PATH}

COPY onnxruntime /onnxruntime

RUN git config --global --add safe.directory /onnxruntime && cd /onnxruntime && git checkout -- . && git clean -fd . && \
    git checkout v1.17.1 && python3 -m pip install -r tools/ci_build/github/linux/docker/inference/x64/python/cpu/scripts/requirements.txt && \
    ./build.sh --allow_running_as_root --skip_submodule_sync --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu/ \
               --use_cuda --config Release --build_shared_lib --build_java --update --build --parallel --cmake_extra_defines ONNXRUNTIME_VERSION=$(cat ./VERSION_NUMBER) 'CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;86'

So my follow-up questions are:

  1. Are there any plans to make this build available in the official Maven Central repository?
  2. Are there any plans to support cuDNN 9? And/or is there any option to build ONNX Runtime without the cuDNN dependency?

@Craigacp
Contributor

Craigacp commented Mar 18, 2024

cuDNN 9 came out after ORT 1.17 (#19419), so it probably won't be supported until at least the next feature release.

We're discussing what to do about CUDA 12 binaries for Java, whether to drop CUDA 11 completely or make two releases. It's not been decided yet.

@davidecaroselli
Author

Got it, thanks! I think cuDNN 9 would not be a huge problem for now as I can manually install cuDNN 8 in the docker file.

My two cents: a solution could be to create two different artifacts, like 1.17.1-cu11 and 1.17.1-cu12; you can always drop the former once you no longer want to support it.

One last problem I'm facing right now: I have just realized that the build I made on Ubuntu 22.04 won't work on Ubuntu 20.04 because of a different libc.so.6 version:

Caused by: java.lang.UnsatisfiedLinkError: /tmp/onnxruntime-java1823669597081387394/libonnxruntime.so: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /tmp/onnxruntime-java1823669597081387394/libonnxruntime.so)

On my 20.04 machine I have /lib/x86_64-linux-gnu/libc-2.31.so. Just wondering: how did you solve this problem for the Java release? The same Maven JAR appears to work on both versions of Ubuntu.

Is there a specific flag I can use during compilation to avoid dynamic linking to a specific version of libc?
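One way to see why a 22.04 build fails on 20.04 is to list the GLIBC symbol versions the shared object actually requires. A hedged sketch (sample `objdump -T` lines are inlined so it runs anywhere; on a real machine you would pipe in the actual objdump output):

```shell
# Sketch: find the highest GLIBC symbol version a shared object depends on.
# Sample lines stand in for real `objdump -T libonnxruntime.so` output.
dump='0000 DF *UND* GLIBC_2.2.5 memcpy
0000 DF *UND* GLIBC_2.32 pthread_getattr_np
0000 DF *UND* GLIBC_2.17 clock_gettime'

needed=$(printf '%s\n' "$dump" | grep -o 'GLIBC_[0-9.]*' | sort -V | tail -n 1)
echo "highest GLIBC required: $needed"
# Ubuntu 20.04 ships glibc 2.31, so any symbol above GLIBC_2.31 here explains
# the UnsatisfiedLinkError; building against an older glibc avoids it.
```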

@Craigacp
Contributor

Not that I'm aware of. I think the release is compiled on 20.04.

@davidecaroselli
Author

Thanks, I'll give it a try!

@snnn
Member

snnn commented Mar 18, 2024

Is there a specific flag I can use during compilation to avoid dynamic linking to a specific version of libc?

No. If you still need to support Ubuntu 20.04, consider using RHEL/CentOS (or UBI8) with the "Red Hat Developer Toolset" to compile the code.

@davidecaroselli
Author

Hi @snnn and thanks for the hint!

I did try to build onnxruntime starting from the nvidia/cuda:12.1.1-cudnn8-devel-ubi8 image; however, I didn't expect it to be so painful 😅.

After a couple of hours of trial and error, I found several changes needed to overcome various compilation problems:

  1. Build protobuf from source and link it statically with ONNX_USE_PROTOBUF_SHARED_LIBS=OFF.
  2. Enforce C++17 standard with CMAKE_CXX_STANDARD=17 and CMAKE_CXX_STANDARD_REQUIRED=ON.
  3. Create a manual symbolic link ln -s /usr/lib64 /usr/lib/x86_64-linux-gnu, as some dependencies have /usr/lib/x86_64-linux-gnu hardcoded in their CMake files.
  4. Skip unit tests build with onnxruntime_BUILD_UNIT_TESTS=OFF as many of them were failing to compile.

Despite all these precautions, I'm still not able to compile onnxruntime because of this error:

...
[ 61%] Linking CXX shared library libonnxruntime.so
[ 97%] Built target onnxruntime_providers_cuda
> Task :clean
> Task :spotlessInternalRegisterDependencies
libonnxruntime_providers.a(matmul_fpq4.cc.o): In function `onnxruntime::contrib::MatMulFpQ4::Compute(onnxruntime::OpKernelContext*) const':
matmul_fpq4.cc:(.text._ZNK11onnxruntime7contrib10MatMulFpQ47ComputeEPNS_15OpKernelContextE+0x4e2): undefined reference to `MlasQ4GemmPackBSize(MLAS_BLK_QUANT_TYPE, unsigned long, unsigned long)'
matmul_fpq4.cc:(.text._ZNK11onnxruntime7contrib10MatMulFpQ47ComputeEPNS_15OpKernelContextE+0x773): undefined reference to `MlasQ4GemmBatch(MLAS_BLK_QUANT_TYPE, unsigned long, unsigned long, unsigned long, unsigned long, MLAS_Q4_GEMM_DATA_PARAMS const*, onnxruntime::concurrency::ThreadPool*)'
libonnxruntime_providers.a(matmul_nbits.cc.o): In function `onnxruntime::contrib::MatMulNBits::Compute(onnxruntime::OpKernelContext*) const':
matmul_nbits.cc:(.text._ZNK11onnxruntime7contrib11MatMulNBits7ComputeEPNS_15OpKernelContextE+0x1264): undefined reference to `void MlasDequantizeBlockwise<float, 4>(float*, unsigned char const*, float const*, unsigned char const*, int, bool, int, int, onnxruntime::concurrency::ThreadPool*)'
libonnxruntime_graph.a(contrib_defs.cc.o): In function `onnxruntime::contrib::matmulQ4ShapeInference(onnx::InferenceContext&, int, int, int, MLAS_BLK_QUANT_TYPE) [clone .constprop.883]':
contrib_defs.cc:(.text._ZN11onnxruntime7contribL22matmulQ4ShapeInferenceERN4onnx16InferenceContextEiii19MLAS_BLK_QUANT_TYPE.constprop.883+0x2e8): undefined reference to `MlasQ4GemmPackBSize(MLAS_BLK_QUANT_TYPE, unsigned long, unsigned long)'
libonnxruntime_mlas.a(platform.cpp.o): In function `MLAS_PLATFORM::MLAS_PLATFORM()':
platform.cpp:(.text._ZN13MLAS_PLATFORMC2Ev+0x574): undefined reference to `MlasFpQ4GemmDispatchAvx512'
platform.cpp:(.text._ZN13MLAS_PLATFORMC2Ev+0x5b1): undefined reference to `MlasQ8Q4GemmDispatchAvx512vnni'
collect2: error: ld returned 1 exit status
gmake[2]: *** [CMakeFiles/onnxruntime.dir/build.make:172: libonnxruntime.so.1.17.1] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2113: CMakeFiles/onnxruntime.dir/all] Error 2
...

...and at this point I'm out of ideas on why it's failing...

Here's the Dockerfile I created so far:

FROM nvidia/cuda:12.1.1-cudnn8-devel-ubi8

ENV DEBIAN_FRONTEND=noninteractive

COPY onnxruntime /onnxruntime

RUN yum install -y zlib-devel python39-devel python39-numpy python39-setuptools python39-wheel python39-pip git unzip wget java-1.8.0-devel && \
    wget https://github.com/Kitware/CMake/releases/download/v3.27.3/cmake-3.27.3-linux-x86_64.tar.gz && \
    tar -zxf cmake-3.27.3-linux-x86_64.tar.gz --strip=1 -C /usr && rm -f cmake-3.27.3-linux-x86_64.tar.gz && \
    wget https://services.gradle.org/distributions/gradle-8.6-bin.zip && unzip gradle-8.6-bin.zip -d /opt/ && rm -f gradle-8.6-bin.zip

RUN git clone https://github.com/protocolbuffers/protobuf.git && cd protobuf && git checkout v21.12 && git submodule update --init --recursive && mkdir build_source && cd build_source && \
    cmake ../cmake  -DCMAKE_INSTALL_LIBDIR=lib64 -Dprotobuf_BUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_POSITION_INDEPENDENT_CODE=ON -Dprotobuf_BUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=Release && \
    make -j$(nproc) && make install

ENV GRADLE_HOME=/opt/gradle-8.6
ENV PATH=${GRADLE_HOME}/bin:${PATH}

RUN git config --global --add safe.directory /onnxruntime && cd /onnxruntime && git checkout -- . && git clean -fd . && \
    git checkout v1.17.1 && python3 -m pip install -r tools/ci_build/github/linux/docker/inference/x64/python/cpu/scripts/requirements.txt && \
    ln -s /usr/lib64 /usr/lib/x86_64-linux-gnu && ./build.sh --allow_running_as_root --skip_submodule_sync --compile_no_warning_as_error --skip_tests \
    --use_cuda --cuda_home /usr/local/cuda --cudnn_home /usr/lib64/ --config Release --build_java --update --build --parallel --cmake_extra_defines \
    ONNXRUNTIME_VERSION=$(cat ./VERSION_NUMBER) CMAKE_CUDA_ARCHITECTURES="52;60;61;70;75;86" CMAKE_CXX_STANDARD=17 CMAKE_CXX_STANDARD_REQUIRED=ON \
    ONNX_USE_PROTOBUF_SHARED_LIBS=OFF onnxruntime_BUILD_UNIT_TESTS=OFF

@davidecaroselli
Author

Update: I was (finally) able to build onnxruntime on the *-ubi8 image by:

  1. Removing the onnxruntime_mlas_q4dq target (it failed due to pthread problems) by replacing this line with a simple if (FALSE):

    if (NOT onnxruntime_ORT_MINIMAL_BUILD)

  2. The build script could not find the JNI headers even though JAVA_HOME was properly set, so I linked them in manually:

for f in $(find $JAVA_HOME -name "*.h"); do ln -s $f /usr/include/$(basename $f); done
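The header-symlink loop in step 2 can be exercised safely against scratch directories (stand-ins for $JAVA_HOME and /usr/include, so nothing system-wide is touched) to see what it does:

```shell
# Sketch of the JNI-header workaround above, pointed at scratch directories:
# $src stands in for $JAVA_HOME, $dst for /usr/include.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/include/linux"
touch "$src/include/jni.h" "$src/include/linux/jni_md.h"

# Same loop as in the workaround: link every header under $src into $dst,
# flattening the directory layout so the compiler finds jni.h and jni_md.h.
for f in $(find "$src" -name "*.h"); do ln -s "$f" "$dst/$(basename "$f")"; done

ls "$dst"
```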

This is the final Dockerfile used to build onnxruntime_gpu:1.17.1-cu12: Dockerfile.ubi8

Would you accept a PR for this? If yes, do you see a more proper way to skip onnxruntime_mlas_q4dq build?

@tianleiwu
Contributor

tianleiwu commented Mar 21, 2024

Would you accept a PR for this? If yes, do you see a more proper way to skip onnxruntime_mlas_q4dq build?

Feel free to contribute a PR. I think you can add a build flag like onnxruntime_BUILD_MLAS_Q4DQ (example). Then change the line to if (onnxruntime_BUILD_MLAS_Q4DQ).
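The suggested guard might look like the following in the MLAS CMake file (a sketch only; onnxruntime_BUILD_MLAS_Q4DQ is the flag name proposed above, and the default value is an assumption):

```cmake
# Hypothetical flag from the suggestion above; defaulting ON keeps today's behavior.
option(onnxruntime_BUILD_MLAS_Q4DQ "Build the onnxruntime_mlas_q4dq target" ON)
if (onnxruntime_BUILD_MLAS_Q4DQ)
  # ... existing onnxruntime_mlas_q4dq target definition ...
endif()
```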

@lanking520

Hi, do we have any updates for CUDA 12 support for ONNXRuntime Java?

@davidecaroselli
Author

Hi @lanking520! Unfortunately my PR (#20011) is blocked waiting for someone to review it. Still, you can build it directly from my fork: the code is tested and the build is currently running in production in my environment.

@snnn do you have any update on the PR? Is there anything I can do to facilitate its merge? Thank you!

@jchen351
Contributor

It was enabled with the completion of #20583, and will be released along with ONNX Runtime 1.18.
