[Build] Unable to build ONNX Runtime against CUDA 12.5 #20953

Closed
mc-nv opened this issue Jun 6, 2024 · 8 comments
Labels
build build issues; typically submitted using template ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider platform:windows issues related to the Windows platform

Comments

@mc-nv
Contributor

mc-nv commented Jun 6, 2024

Describe the issue

Unable to build ONNX Runtime against CUDA 12.5.

Urgency

It's quite important, as it may impact the Triton 24.06 scope.

Target platform

Windows

Build branch

rel-1.18.0

Build script

./build.sh --config Release --skip_submodule_sync --parallel --build_shared_lib     --build_dir /workspace/build --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES='60;61;70;75;80;86;90'  --update --build --use_cuda --cuda_home "/usr/local/cuda" --cudnn_home "/usr/local/cudnn-9.1/cuda" --use_tensorrt --use_tensorrt_builtin_parser --tensorrt_home "/usr/src/tensorrt" --allow_running_as_root

Error / output

#16 98.94 [ 73%] Building CXX object CMakeFiles/onnxruntime_providers.dir/workspace/onnxruntime/onnxruntime/core/providers/cpu/ml/scaler.cc.o
#16 99.04 /workspace/onnxruntime/onnxruntime/contrib_ops/cuda/moe/ft_moe/moe_kernel.cu(68): error: identifier "FLT_MAX" is undefined
#16 99.04     float threadData(-FLT_MAX);
#16 99.04                       ^
#16 99.04 
#16 99.12 1 error detected in the compilation of "/workspace/onnxruntime/onnxruntime/contrib_ops/cuda/moe/ft_moe/moe_kernel.cu".

Visual Studio Version

No response

GCC / Compiler Version

No response

@mc-nv mc-nv added the build build issues; typically submitted using template label Jun 6, 2024
@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider platform:windows issues related to the Windows platform labels Jun 6, 2024
@mc-nv mc-nv changed the title [Build] [Build] Unable to build ONNX Runtime against CUDA 12.5 Jun 6, 2024
@mc-nv
Contributor Author

mc-nv commented Jun 6, 2024

cc: @pranavsharma

@chilo-ms
Contributor

chilo-ms commented Jun 6, 2024

+@tianleiwu

@yf711
Contributor

yf711 commented Jun 7, 2024

Hi @mc-nv, does your test branch include this commit? I tried building the latest main with CUDA 12.5 and could not reproduce this issue.
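
One way to answer this question locally is to ask git whether the fix commit is an ancestor of the checked-out branch. The sketch below assumes a local clone of the repository. `FIX_COMMIT` is a placeholder, since the commit hash is not given in this thread, and the helper name `branch_has_commit` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if commit $1 is reachable from ref $2 (default HEAD).
# `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B,
# 1 when it is not, and another non-zero code on errors such as an unknown hash.
branch_has_commit() {
  local commit="$1" ref="${2:-HEAD}"
  git merge-base --is-ancestor "$commit" "$ref"
}

# FIX_COMMIT is a placeholder; substitute the hash of the commit referenced above.
FIX_COMMIT="${FIX_COMMIT:-}"
if [ -n "$FIX_COMMIT" ]; then
  if branch_has_commit "$FIX_COMMIT" HEAD; then
    echo "HEAD contains $FIX_COMMIT"
  else
    echo "HEAD does not contain $FIX_COMMIT (or the hash is unknown in this clone)"
  fi
fi
```

Run from a checkout of rel-1.18.0 with `FIX_COMMIT` set, this would show whether the release branch already carries the change that main has.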

@mc-nv
Contributor Author

mc-nv commented Jun 7, 2024

> Does your test branch include this commit? I tried building the latest main with CUDA 12.5 and could not reproduce this issue.

@yf711 my build is based on rel-1.18.0

@yf711
Contributor

yf711 commented Jun 7, 2024

> > Does your test branch include this commit? I tried building the latest main with CUDA 12.5 and could not reproduce this issue.

> @yf711 my build is based on rel-1.18.0

I see. rel-1.18.1, which is planned for release on 6/17, will include this fix and support CUDA 12.5.

@mc-nv
Contributor Author

mc-nv commented Jun 7, 2024

> > > Does your test branch include this commit? I tried building the latest main with CUDA 12.5 and could not reproduce this issue.

> > @yf711 my build is based on rel-1.18.0

> I see. rel-1.18.1, which is planned for release on 6/17, will include this fix and support CUDA 12.5.

Can we have that branch made available?
I don't see any 1.18.1 changes in the working tree that I could refer to.
https://github.com/microsoft/onnxruntime/branches/all?query=1.18

@yf711
Contributor

yf711 commented Jun 7, 2024

> > > > Does your test branch include this commit? I tried building the latest main with CUDA 12.5 and could not reproduce this issue.

> > > @yf711 my build is based on rel-1.18.0

> > I see. rel-1.18.1, which is planned for release on 6/17, will include this fix and support CUDA 12.5.

> Can we have that branch made available? I don't see any 1.18.1 changes in the working tree that I could refer to. https://github.com/microsoft/onnxruntime/branches/all?query=1.18

The branch will be available for testing by the end of next week (6/14).
I was also just notified that the scheduled 6/17 release will be delayed a bit. I'll post an update once the new date is determined.

tianleiwu added a commit that referenced this issue Jun 11, 2024
### Description
Upgrade cutlass to 3.5 to fix build errors using CUDA 12.4 or 12.5 on
Windows
- [x] Upgrade cutlass to 3.5.0.
- [x] Fix flash attention build error with latest cutlass header files
and APIs. This fix is provided by @wangyems.
- [x] Update efficient attention to use new cutlass fmha interface.
- [x] Patch cutlass to fix `hrsqrt` not found error for sm < 53.
- [x] Disable TF32 Staged Accumulation to fix blkq4_fp16_gemm_sm80_test
build error for cuda 11.8 to 12.3.
- [x] Disable TRT 10 deprecation warnings.

The following are not included in this PR:
* TRT provider replaces the deprecated APIs.
* Fix blkq4_fp16_gemm_sm80_test build error for cuda 12.4 or 12.5. This
test is not built by default unless you add `--cmake_extra_defines
onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON` in build command.

To integrate to rel-1.18.1: Either bring in other changes (like onnx
1.16.1), or generate manifest and upload a new ONNX Runtime Build Time
Deps artifact based on rel-1.18.1.

### Motivation and Context
#19891
#20924
#20953
@tianleiwu
Contributor

Using branch rel-1.18.1 and adding `--compile_no_warning_as_error` to the build.sh command line should fix it.
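
Combined with the build command from the original report, this advice might look like the sketch below. It assumes a clone of the repository in which the rel-1.18.1 branch is already published, and it reuses the paths from the report unchanged. The `ORT_SRC` and `RUN_BUILD` variables are illustrative only and are not part of the build script.

```shell
#!/usr/bin/env bash
# Assumed location of the onnxruntime checkout (inferred from the paths in the error log).
ORT_SRC="${ORT_SRC:-/workspace/onnxruntime}"

# Same arguments as the original report, plus --compile_no_warning_as_error.
build_args=(
  --config Release --skip_submodule_sync --parallel --build_shared_lib
  --build_dir /workspace/build
  --cmake_extra_defines 'CMAKE_CUDA_ARCHITECTURES=60;61;70;75;80;86;90'
  --update --build
  --use_cuda --cuda_home /usr/local/cuda --cudnn_home /usr/local/cudnn-9.1/cuda
  --use_tensorrt --use_tensorrt_builtin_parser --tensorrt_home /usr/src/tensorrt
  --allow_running_as_root
  --compile_no_warning_as_error
)

# By default only print the command; set RUN_BUILD=1 to check out the branch and build.
if [ "${RUN_BUILD:-0}" = "1" ]; then
  cd "$ORT_SRC" && git checkout rel-1.18.1 && ./build.sh "${build_args[@]}"
else
  printf '%s ' ./build.sh "${build_args[@]}"
  printf '\n'
fi
```

Keeping the arguments in an array preserves the semicolon-separated architecture list as a single argument, which the original command achieved with single quotes.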
