[Build] AllocatorTest.CUDAAllocatorFallbackTest failed #21343

Open
tianleiwu opened this issue Jul 12, 2024 · 2 comments
Labels
build: build issues; typically submitted using template
ep:CUDA: issues related to the CUDA execution provider
stale: issues that have not been addressed in a while; categorized by a bot

Comments

@tianleiwu (Contributor)

Describe the issue

The unit test AllocatorTest.CUDAAllocatorFallbackTest fails on an A100 80GB GPU.

Urgency

No response

Target platform

ubuntu 20.04

Build script

sh build.sh --config Release --build_dir build/cuda12 --build_shared_lib --parallel --use_cuda \
  --cuda_version 12.5 --cuda_home /data/cuda12 \
  --cudnn_home /data/cuda12/ \
  --build_wheel --skip_tests \
  --cmake_generator Ninja \
  --cmake_extra_defines onnxruntime_BUILD_UNIT_TESTS=ON CMAKE_CUDA_ARCHITECTURES=80 onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON

Error / output

onnxruntime/build/cuda12/Release$ ./onnxruntime_test_all --gtest_filter=CUDA_EP_Unittest*

[ RUN ] AllocatorTest.CUDAAllocatorFallbackTest
unknown file: Failure
C++ exception with description "/home/tlwu/onnxruntime/onnxruntime/core/framework/bfc_arena.cc:376 void* onnxruntime::BFCArena::AllocateRawInternal(size_t, bool, onnxruntime::Stream*, bool, onnxruntime::WaitNotificationFn) Failed to allocate memory for requested buffer of size 42487119872
" thrown in the test body.

[ FAILED ] AllocatorTest.CUDAAllocatorFallbackTest (3 ms)

Visual Studio Version

No response

GCC / Compiler Version

11.4

@tianleiwu added the "build" label Jul 12, 2024
@tianleiwu (Contributor, Author) commented Jul 12, 2024

The test fails if roughly 80% of GPU memory is already in use by other processes. It seems the test relies on an assumption that does not always hold.
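For illustration, here is a minimal sketch of the kind of pre-check a test could run before a large allocation. The helper names (`BytesToGiB`, `FreeFraction`, `HasEnoughFreeMemory`) are hypothetical and not part of onnxruntime. The sketch assumes the caller gets the free and total byte counts from the CUDA runtime call `cudaMemGetInfo`. It is written without CUDA so it compiles anywhere. It also shows that the failing request of 42487119872 bytes is about 39.6 GiB, which cannot be satisfied on an 80GB card when 80% of memory is already in use.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; not part of onnxruntime.
// In a real CUDA test, free_bytes and total_bytes would be obtained from
// cudaMemGetInfo(&free_bytes, &total_bytes).

// Convert a byte count to GiB (2^30 bytes).
inline double BytesToGiB(std::uint64_t bytes) {
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// Fraction of device memory that is currently free, in [0, 1].
inline double FreeFraction(std::uint64_t free_bytes, std::uint64_t total_bytes) {
  assert(total_bytes > 0 && free_bytes <= total_bytes);
  return static_cast<double>(free_bytes) / static_cast<double>(total_bytes);
}

// True if a request of requested_bytes (plus optional headroom) fits in
// the memory that is currently free.
inline bool HasEnoughFreeMemory(std::uint64_t free_bytes,
                                std::uint64_t requested_bytes,
                                std::uint64_t headroom_bytes = 0) {
  // Guard against overflow in requested_bytes + headroom_bytes.
  if (requested_bytes > std::numeric_limits<std::uint64_t>::max() - headroom_bytes) {
    return false;
  }
  return free_bytes >= requested_bytes + headroom_bytes;
}
```

In a GoogleTest body, one option would be to call `GTEST_SKIP()` when `HasEnoughFreeMemory` returns false, so the test reports as skipped rather than failed on a busy GPU. Whether skipping is the right behavior for this particular test is a judgment call for the maintainers.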

@tianleiwu changed the title from "[Build] AllocatorTest.CUDAAllocatorFallbackTest failed in A100 80GB" to "[Build] AllocatorTest.CUDAAllocatorFallbackTest failed" Jul 12, 2024
@sophies927 added the "ep:CUDA" label Jul 18, 2024

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

@github-actions bot added the "stale" label Aug 18, 2024