
[TensorRT EP] Enable a minimal CUDA EP compilation without kernels #19052

Merged · 2 commits merged into microsoft:main on Jan 17, 2024

Conversation

gedoensmax
Contributor

Addresses #18542.
I followed the advice given by @RyanUnderhill here and went with a minimal CUDA EP for now.

@tianleiwu
Contributor

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline

@tianleiwu
Contributor

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, ONNX Runtime React Native CI Pipeline, Windows x64 QNN CI Pipeline

@tianleiwu
Contributor

/azp run Linux MIGraphX CI Pipeline, orttraining-amd-gpu-ci-pipeline


Azure Pipelines successfully started running 2 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).


@tianleiwu
Contributor

tianleiwu commented Jan 11, 2024

@gedoensmax,

What's an example build command line for this?

I tried the following, but there is a build error:

export CUDA_HOME=/usr/local/cuda-12.2
export CUDNN_HOME=/usr/lib/x86_64-linux-gnu/
export CUDACXX=/usr/local/cuda-12.2/bin/nvcc
export TRT_HOME=/usr/src/tensorrt

sh build.sh --config Release  --build_shared_lib --parallel --cuda_version 12.2 \
            --cuda_home $CUDA_HOME --cudnn_home $CUDNN_HOME --build_wheel --skip_tests \
            --use_tensorrt --tensorrt_home $TRT_HOME \
            --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF \
            --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=80 \
            --cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON

BTW, there are conflicts.

@gedoensmax
Contributor Author

@tianleiwu I believe you are missing --use_cuda, and I also use onnxruntime_DISABLE_CONTRIB_OPS=ON. To be honest, I always build with CMake directly, but I can try the build script tomorrow.

@gedoensmax
Contributor Author

OK, I could not hold back and tried it right away. I verified that this works on my end:

./build.sh --config Release  --build_shared_lib --parallel --cuda_version 12.2 \
            --cuda_home $CUDA_HOME --cudnn_home $CUDNN_HOME --build_wheel --skip_tests \
            --use_tensorrt --tensorrt_home $TRT_HOME \
            --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=OFF \
            --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=89 \
            --cmake_extra_defines onnxruntime_CUDA_MINIMAL=ON \
            --cmake_extra_defines onnxruntime_DISABLE_CONTRIB_OPS=ON \
            --build_dir build_script
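
For anyone who prefers invoking CMake directly (as mentioned above), the build-script flags map roughly to a configure line like the one below. This is only a sketch: the cache-variable names and paths are assumptions and may differ between ORT versions, so check cmake/CMakeLists.txt for the exact options your checkout supports.

```shell
# Rough direct-CMake equivalent of the build.sh invocation above (sketch;
# variable names and paths are assumptions, verify against your ORT version).
cmake -S cmake -B build_script/Release \
      -DCMAKE_BUILD_TYPE=Release \
      -Donnxruntime_USE_CUDA=ON \
      -Donnxruntime_USE_TENSORRT=ON \
      -Donnxruntime_CUDNN_HOME=$CUDNN_HOME \
      -Donnxruntime_TENSORRT_HOME=$TRT_HOME \
      -DCMAKE_CUDA_ARCHITECTURES=89 \
      -Donnxruntime_BUILD_SHARED_LIB=ON \
      -Donnxruntime_BUILD_UNIT_TESTS=OFF \
      -Donnxruntime_CUDA_MINIMAL=ON \
      -Donnxruntime_DISABLE_CONTRIB_OPS=ON
cmake --build build_script/Release --parallel
```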

@tianleiwu
Contributor

@gedoensmax, I tried your build command; it worked before the last merge commit but fails after the merge:

onnxruntime/onnxruntime/core/providers/cuda/cuda_stream_handle.cc:59:1: error: no declaration matches ‘onnxruntime::CudaStream::CudaStream(cudaStream_t, const OrtDevice&, onnxruntime::AllocatorPtr, bool, bool, cudnnHandle_t, cublasHandle_t)’
   59 | CudaStream::CudaStream(cudaStream_t stream,
      | ^~~~~~~~~~

@gedoensmax force-pushed the trt_compile_no_cu_ops branch from beb86b5 to a343bb2 on January 12, 2024 12:53
@gedoensmax
Contributor Author

Sorry, the web UI (or my use of it) must have messed this up.

@tianleiwu
Contributor

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline

@tianleiwu
Contributor

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, ONNX Runtime React Native CI Pipeline, Windows x64 QNN CI Pipeline

@tianleiwu
Contributor

/azp run Linux MIGraphX CI Pipeline, orttraining-amd-gpu-ci-pipeline


Azure Pipelines successfully started running 9 pipeline(s).


Azure Pipelines successfully started running 2 pipeline(s).


Azure Pipelines successfully started running 9 pipeline(s).

@tianleiwu
Contributor

/azp run orttraining-amd-gpu-ci-pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@tianleiwu tianleiwu merged commit bc219ed into microsoft:main Jan 17, 2024
72 of 74 checks passed
yf711 added a commit that referenced this pull request Dec 5, 2024
yf711 added a commit that referenced this pull request Dec 19, 2024
### Description
New CI:
[Linux_TRT_Minimal_CUDA_Test_CI](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=230&_a=summary)
and
[Win_TRT_Minimal_CUDA_Test_CI](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=231)

These pipelines are configured to monitor that ORT-TRTEP with minimal CUDA keeps building without issues.
* The YAML follows the Linux TRT CI YAML, with different build args and cache names.
* The build args follow [[TensorRT EP] Enable a minimal CUDA EP compilation without kernels](#19052 (comment)).

### Motivation and Context
Monitor whether users can build ORT-TRTEP with minimal CUDA without any blockers (the build takes ~30 min).
guschmue pushed a commit that referenced this pull request Dec 20, 2024