[Documentation] How to modularize ONNXRT: CPU first, then CPU with OpenVINO EP, then NVIDIA GPU with TRT EP, simply by adding new provider libraries and all their dependencies #23104

jcdatin opened this issue Dec 13, 2024 · 1 comment

jcdatin commented Dec 13, 2024

Describe the documentation issue

The goal is to create modular deployment profiles (such as Docker image layers) that are stacked as provider capabilities are added. See the diagram below:
[diagram of layered deployment profiles]

This is supposed to work already, as described in "Build with different EPs - onnxruntime". Quoting:
"Execution Provider Shared Libraries

The TensorRT, and OpenVINO™ providers are built as shared libraries vs being statically linked into the main onnxruntime. This enables them to be loaded only when needed, and if the dependent libraries of the provider are not installed onnxruntime will still run fine, it just will not be able to use that provider. 

Loading the shared providers

Shared provider libraries are loaded by the onnxruntime code (do not load or depend on them in your client code…. [libraries] will be loaded at runtime when the provider is added to the session options (through a call like SessionOptionsAppendExecutionProvider_OpenVINO in the C API). If a shared provider library cannot be loaded (if the file doesn’t exist, or its dependencies don’t exist or not in the path) then an error will be returned.

The onnxruntime code will look for the provider shared libraries in the same location as the onnxruntime shared library is (or the executable statically linked to the static library version)."
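For context, here is what that call path looks like from the C++ API side. This is only a minimal sketch: "model.onnx" is a placeholder, and the OpenVINO device_type string depends on the ORT version/build.

#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "openvino-demo");
  Ort::SessionOptions session_options;

  // Per the documentation quoted above, adding the provider to the session
  // options is the point at which onnxruntime loads
  // libonnxruntime_providers_openvino.so (looked up next to libonnxruntime.so).
  // If the library or one of its dependencies cannot be resolved, an
  // Ort::Exception is raised here rather than a hard link error at startup.
  OrtOpenVINOProviderOptions ov_options{};
  ov_options.device_type = "CPU";  // assumption: exact value ("CPU", "CPU_FP32", ...) varies by ORT version
  session_options.AppendExecutionProvider_OpenVINO(ov_options);

  Ort::Session session(env, "model.onnx", session_options);  // placeholder model path
  return 0;
}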

However, it does not seem to work: when building ONNXRT with the TRT EP, I get both the CUDA EP and the TRT EP, but if I remove libonnxruntime_providers_cuda.so, my client code gets a runtime link error looking for this library, even though it does not add (depend on) OrtSessionOptionsAppendExecutionProvider_CUDA. See [Build] CUDA Execution Provider library is needed despite we only use TensorRT Execution Provider · Issue #22960 · microsoft/onnxruntime.
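For reference, a small check like the one below (a sketch using the C++ API) lists which EPs a given build was compiled with; note it reflects build-time registration only, not whether the provider shared libraries and their dependencies are actually present or loadable on the host.

#include <onnxruntime_cxx_api.h>
#include <iostream>
#include <string>

int main() {
  // Prints the EPs this onnxruntime build was compiled with, e.g.
  // TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider.
  // It does not check whether the corresponding provider .so files are deployed.
  for (const std::string& provider : Ort::GetAvailableProviders())
    std::cout << provider << "\n";
  return 0;
}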

Here is how I built ONNXRT:
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ \
  --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"

So can you tell me how to build all provider libraries and deploy them separately, based on the CPU/GPU configuration of each host?
ONNX Runtime should still run even if a provider library is absent, since the client code does not use that provider.
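In other words, I would expect client code along the following lines (a sketch, assuming the C++ API and a placeholder model path) to run on every deployment profile, falling back when a provider library is not shipped on that host:

#include <onnxruntime_cxx_api.h>
#include <iostream>

// Sketch: ask for the richest EP first; if its shared provider library (or one
// of its dependencies) cannot be loaded on this host, fall back to the next profile.
Ort::Session create_session(Ort::Env& env, const char* model_path) {
  Ort::SessionOptions so;
  try {
    OrtTensorRTProviderOptions trt_options{};        // GPU host: loads libonnxruntime_providers_tensorrt.so
    so.AppendExecutionProvider_TensorRT(trt_options);
  } catch (const Ort::Exception& e) {
    std::cerr << "TensorRT EP not available: " << e.what() << "\n";
    try {
      OrtOpenVINOProviderOptions ov_options{};       // OpenVINO host: loads libonnxruntime_providers_openvino.so
      so.AppendExecutionProvider_OpenVINO(ov_options);
    } catch (const Ort::Exception& e2) {
      std::cerr << "OpenVINO EP not available, using default CPU EP: " << e2.what() << "\n";
      // Nothing to append: the CPU EP is built into libonnxruntime.so.
    }
  }
  return Ort::Session(env, model_path, so);          // model_path is a placeholder, e.g. "model.onnx"
}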

Should I enable all providers in the build, such as:
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ \
  --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --use_openvino --openvino_home /usr/local/OpenVino \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"

Please advise

Page / URL

No response

jcdatin commented Dec 13, 2024

Reading the doc more carefully: will this work if we combine the TensorRT and OpenVINO build options as follows?
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ \
  --use_tensorrt --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --use_openvino CPU \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"
