[Build]try to build docker image, failed at 81% and the whole system crashed thrice. Have to push the power button every time to force restart #18010

Closed
yongjer opened this issue Oct 18, 2023 · 1 comment
Labels
build build issues; typically submitted using template ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.

Comments


yongjer commented Oct 18, 2023

Describe the issue

[Screenshot attached: 20231018_101027]

Urgency

No response

Target platform

Host: Ubuntu 20.04.6, CUDA 12.2, driver version 535.104.12

Build script

The official TensorRT Dockerfile at
https://github.com/microsoft/onnxruntime/blob/main/dockerfiles/Dockerfile.tensorrt

Error / output

[ 81%] Building CUDA object CMakeFiles/onnxruntime_providers_cuda.dir/code/onnxruntime/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim224_fp16_sm80.cu.o

Visual Studio Version

vscode: 1.83.1

GCC / Compiler Version

gcc: Ubuntu 9.4.0-1ubuntu1~20.04.2

yongjer added the build label on Oct 18, 2023
github-actions bot added the ep:CUDA, ep:TensorRT, and model:transformer labels on Oct 18, 2023
yongjer changed the title from "[Build] build docker image failed at 81% and the whole system crashed thrice. Have to push the power button to force restart" to "[Build]try to build docker image, failed at 81% and the whole system crashed thrice. Have to push the power button to force restart" on Oct 18, 2023
yongjer changed the title from "[Build]try to build docker image, failed at 81% and the whole system crashed thrice. Have to push the power button to force restart" to "[Build]try to build docker image, failed at 81% and the whole system crashed thrice. Have to push the power button every time to force restart" on Oct 18, 2023
Member

snnn commented Oct 18, 2023

The build might have run out of memory. Please try a more powerful machine, for example one with 64 GB of memory. You could also add "--nvcc_threads=1" to the build.sh arguments in https://github.com/microsoft/onnxruntime/blob/main/dockerfiles/Dockerfile.tensorrt; that might help.
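
For anyone who wants to check memory headroom before starting the build, here is a minimal sketch (not part of onnxruntime) that reads MemAvailable from /proc/meminfo on a Linux host and suggests the extra argument mentioned above when memory is low. The function names and the 64 GiB threshold are assumptions for illustration; the threshold simply mirrors the figure in the comment and is not a measured requirement.

```python
from __future__ import annotations


def mem_available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # /proc/meminfo reports this value in kB
            return kib / (1024 * 1024)
    raise ValueError("MemAvailable not found in meminfo text")


def suggested_build_args(available_gib: float, threshold_gib: float = 64.0) -> list[str]:
    """Suggest the extra build.sh argument from this thread when memory is low.

    threshold_gib is an assumed rule of thumb taken from the comment above,
    not a documented requirement of the build.
    """
    if available_gib < threshold_gib:
        return ["--nvcc_threads=1"]
    return []


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            gib = mem_available_gib(f.read())
    except FileNotFoundError:
        print("/proc/meminfo not found; this sketch only works on Linux.")
    else:
        print(f"MemAvailable: {gib:.1f} GiB")
        print("Suggested extra build.sh args:", suggested_build_args(gib) or "(none)")
```

If it suggests the argument, append it to the build.sh invocation in your local copy of the Dockerfile before running docker build.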

@snnn snnn closed this as completed Oct 18, 2023