[Documentation] How to modularize ONNXRT: CPU first, then CPU with OpenVINO EP, then NVIDIA GPU with TRT EP, simply by adding new provider libraries and all their dependencies #23104

jcdatin opened this issue Dec 13, 2024 · 1 comment

jcdatin commented Dec 13, 2024

Describe the documentation issue

The goal is to create modular deployment profiles (such as Docker image layers) that are stacked as provider capabilities are added. See the diagram below:
[diagram of layered deployment profiles]

This is supposed to work already, as described in "Build with different EPs - onnxruntime". Quoting:
"Execution Provider Shared Libraries

The TensorRT, and OpenVINO™ providers are built as shared libraries vs being statically linked into the main onnxruntime. This enables them to be loaded only when needed, and if the dependent libraries of the provider are not installed onnxruntime will still run fine, it just will not be able to use that provider. 

Loading the shared providers

Shared provider libraries are loaded by the onnxruntime code (do not load or depend on them in your client code…. [libraries] will be loaded at runtime when the provider is added to the session options (through a call like SessionOptionsAppendExecutionProvider_OpenVINO in the C API). If a shared provider library cannot be loaded (if the file doesn’t exist, or its dependencies don’t exist or not in the path) then an error will be returned.

The onnxruntime code will look for the provider shared libraries in the same location as the onnxruntime shared library is (or the executable statically linked to the static library version)."
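For context, here is what that call path looks like from the C++ API side. This is only a minimal sketch: "model.onnx" is a placeholder, and the OpenVINO device_type string depends on the ORT version/build.

#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "openvino-demo");
  Ort::SessionOptions session_options;

  // Per the documentation quoted above, adding the provider to the session
  // options is the point at which onnxruntime loads
  // libonnxruntime_providers_openvino.so (looked up next to libonnxruntime.so).
  // If the library or one of its dependencies cannot be resolved, an
  // Ort::Exception is raised here rather than a hard link error at startup.
  OrtOpenVINOProviderOptions ov_options{};
  ov_options.device_type = "CPU";  // assumption: exact value ("CPU", "CPU_FP32", ...) varies by ORT version
  session_options.AppendExecutionProvider_OpenVINO(ov_options);

  Ort::Session session(env, "model.onnx", session_options);  // placeholder model path
  return 0;
}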

However, it does not seem to work: when building ONNXRT with the TRT EP, I get both the CUDA EP and the TRT EP, but if I remove libonnxruntime_providers_cuda.so, my client code gets a runtime link error looking for this library, even though it does not add (depend on) OrtSessionOptionsAppendExecutionProvider_CUDA. See [Build] CUDA Execution Provider library is needed despite we only use TensorRT Execution Provider · Issue #22960 · microsoft/onnxruntime.
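For reference, a small check like the one below (a sketch using the C++ API) lists which EPs a given build was compiled with; note it reflects build-time registration only, not whether the provider shared libraries and their dependencies are actually present or loadable on the host.

#include <onnxruntime_cxx_api.h>
#include <iostream>
#include <string>

int main() {
  // Prints the EPs this onnxruntime build was compiled with, e.g.
  // TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider.
  // It does not check whether the corresponding provider .so files are deployed.
  for (const std::string& provider : Ort::GetAvailableProviders())
    std::cout << provider << "\n";
  return 0;
}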

Here is how I built ONNXRT:
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ \
  --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"

So can you tell me how to build all provider libraries and deploy them separately, based on the CPU/GPU configuration of each host?
ONNX Runtime should still run even if a provider library is absent, since the client code does not use that provider.
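In other words, I would expect client code along the following lines (a sketch, assuming the C++ API and a placeholder model path) to run on every deployment profile, falling back when a provider library is not shipped on that host:

#include <onnxruntime_cxx_api.h>
#include <iostream>

// Sketch: ask for the richest EP first; if its shared provider library (or one
// of its dependencies) cannot be loaded on this host, fall back to the next profile.
Ort::Session create_session(Ort::Env& env, const char* model_path) {
  Ort::SessionOptions so;
  try {
    OrtTensorRTProviderOptions trt_options{};        // GPU host: loads libonnxruntime_providers_tensorrt.so
    so.AppendExecutionProvider_TensorRT(trt_options);
  } catch (const Ort::Exception& e) {
    std::cerr << "TensorRT EP not available: " << e.what() << "\n";
    try {
      OrtOpenVINOProviderOptions ov_options{};       // OpenVINO host: loads libonnxruntime_providers_openvino.so
      so.AppendExecutionProvider_OpenVINO(ov_options);
    } catch (const Ort::Exception& e2) {
      std::cerr << "OpenVINO EP not available, using default CPU EP: " << e2.what() << "\n";
      // Nothing to append: the CPU EP is built into libonnxruntime.so.
    }
  }
  return Ort::Session(env, model_path, so);          // model_path is a placeholder, e.g. "model.onnx"
}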

Should I enable all providers in the build, such as:
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ \
  --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --use_openvino --openvino_home /usr/local/OpenVino \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"

Please advise

Page / URL

No response

jcdatin commented Dec 13, 2024

Reading the doc more carefully: will this work if we combine the TensorRT and OpenVINO build options as follows?
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ \
  --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --use_openvino CPU \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"
