[WIP] new CUDA CI+development Docker container #1162
base: develop
Update to CUDA 12.6, Ubuntu 24.04, and Clang 19.
Ascent cannot be compiled with CUDA 12.4+, because VTK-m does not support it (https://gitlab.kitware.com/vtk/vtk-m/-/issues/790). VTK-m also fails to build with gcc 12+:
|
@pgrete do we need Ascent in the CI container anymore? it doesn't appear to work with either CUDA 12.6 or gcc 12+. |
Given that we still support Ascent (and are still, in principle, interested in using it), it'd be great if we could keep it. |
The GitLab VTK-m issues suggest it doesn't work with CUDA 12.4 or 12.5, and it failed for me with 12.6. Maybe it works with earlier minor versions. However, 12.6 is the only version that has an Ubuntu 24.04 container. Feel free to push changes to this branch; I can't work on this more at the moment. |
@pgrete However, everything except Ascent works now, both on x86-64 and arm64. |
I've checked that Ascent builds with CUDA 12.0. So it stops working somewhere between CUDA 12.1-12.4. |
Thanks for tackling the version inconsistencies. |
adios2 and openpmd build now with the version in this PR.
I don't follow your suggestion. Can you explain what you mean by this? Do you want us to download the build_ascent.sh script from the Ascent repo and then apply a patch to it? |
@pgrete Everything builds now if I set
Update: The issue is that when CUDA support for Ascent is enabled, the compiler is OOM killed. I am running Docker with resource settings set to 12 GB RAM and 1 GB swap. This is apparently not enough. The actual error message is:
|
@pgrete I finally got it to build with 15 GB of RAM and 4 GB of swap (for the Docker VM). Can you build it and verify that it works, and if so, upload it to DockerHub? |
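The memory figures reported above (12 GB RAM with 1 GB swap was not enough, 15 GB RAM with 4 GB swap was) can be checked before starting a long build. The following is a minimal sketch, assuming a Linux environment with `/proc/meminfo`; the function names are made up for illustration, and the thresholds use decimal gigabytes so that a VM configured with 15 GB still passes after the kernel reserves some memory.

```shell
#!/bin/sh
# Thresholds from the report above, in kB as used by /proc/meminfo.
# Decimal GB is an assumption chosen to leave slack for kernel-reserved memory.
MIN_RAM_KB=$((15 * 1000 * 1000))
MIN_SWAP_KB=$((4 * 1000 * 1000))

# has_enough_memory RAM_KB SWAP_KB: exit status 0 if both thresholds are met.
has_enough_memory() {
  [ "$1" -ge "$MIN_RAM_KB" ] && [ "$2" -ge "$MIN_SWAP_KB" ]
}

# read_meminfo_kb KEY [FILE]: print the kB value of KEY (e.g. MemTotal).
read_meminfo_kb() {
  awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# check_build_memory: warn (and return 1) if the current machine is below the thresholds.
check_build_memory() {
  ram_kb=$(read_meminfo_kb MemTotal)
  swap_kb=$(read_meminfo_kb SwapTotal)
  if has_enough_memory "$ram_kb" "$swap_kb"; then
    echo "memory looks sufficient for the container build"
  else
    echo "warning: below 15 GB RAM / 4 GB swap; the build may be OOM-killed" >&2
    return 1
  fi
}
```

Calling `check_build_memory` inside the Docker VM (or on the build host) before the image build gives an early warning instead of a compiler killed partway through.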
I just pushed the changes I had in mind (using the build script from the Ascent src with a small patch, and fixed versions for ADIOS2 and OpenPMD).
$ cmake -B build-ascent -DCMAKE_BUILD_TYPE=Release -DMACHINE_VARIANT=cuda-mpi -DMACHINE_CFG=$(pwd)/cmake/machinecfg/CI.cmake -DPARTHENON_ENABLE_ASCENT=ON -DAscent_DIR=/usr/local/ascent-checkout/lib/cmake/ascent
...
-- Building performance tests.
-- Building regression tests.
-- Creating BLT MPI targets...
-- FindMPI Enabled (ENABLE_FIND_MPI == ON)
-- Found MPI_C: /opt/openmpi/lib/libmpi.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- BLT MPI Compile Flags:
-- BLT MPI Include Paths: /opt/openmpi/include
-- BLT MPI Libraries: /opt/openmpi/lib/libmpi.so
-- BLT MPI Link Flags: SHELL:-Wl,-rpath -Wl,/opt/openmpi/lib -Wl,--enable-new-dtags
-- MPI Executable: /opt/openmpi/bin/mpiexec
-- MPI Num Proc Flag: -n
-- MPI Command Append:
-- Creating BLT CUDA targets...
CMake Error at /usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:726 (message):
Compiling the CUDA compiler identification source file
"CMakeCUDACompilerId.cu" failed.
Compiler: /usr/local/cuda/bin/nvcc
Build flags:
Id flags:
--keep;--keep-dir;tmp;-ccbin=/parthenon/external/Kokkos/bin/nvcc_wrapper -v
The output was:
2
#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/usr/local/cuda/bin
#$ _THERE_=/usr/local/cuda/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_DIR_=targets/x86_64-linux
#$ TOP=/usr/local/cuda/bin/..
#$ NVVMIR_LIBRARY_DIR=/usr/local/cuda/bin/../nvvm/libdevice
#$
LD_LIBRARY_PATH=/usr/local/cuda/bin/../lib:/opt/openmpi/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
#$
PATH=/usr/local/cuda/bin/../nvvm/bin:/usr/local/cuda/bin:/opt/openmpi/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
#$ INCLUDES="-I/usr/local/cuda/bin/../targets/x86_64-linux/include"
#$ LIBRARIES= "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib/stubs"
"-L/usr/local/cuda/bin/../targets/x86_64-linux/lib"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ rm tmp/a_dlink.reg.c
#$ "/parthenon/external/Kokkos/bin"/nvcc_wrapper -D__CUDA_ARCH_LIST__=520
-E -x c++ -D__CUDACC__ -D__NVCC__
"-I/usr/local/cuda/bin/../targets/x86_64-linux/include"
-D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=0
-D__CUDACC_VER_BUILD__=140 -D__CUDA_API_VER_MAJOR__=12
-D__CUDA_API_VER_MINOR__=0 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
"cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
"tmp/CMakeCUDACompilerId.cpp4.ii"
<command-line>: warning: "__CUDA_ARCH_LIST__" redefined
<command-line>: note: this is the location of the previous definition
#$ cudafe++ --c++17 --gnu_version=110400 --display_error_number
--orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name
"/parthenon/build-ascent/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu"
--allow_managed --m64 --parse_templates --gen_c_file_name
"tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name
"CMakeCUDACompilerId.cudafe1.stub.c" --gen_module_id_file
--module_id_file_name "tmp/CMakeCUDACompilerId.module_id"
"tmp/CMakeCUDACompilerId.cpp4.ii"
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(94):
error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(98):
error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(103):
error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(104):
error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(109):
error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(110):
error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(114):
error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(118):
error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(122):
error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(126):
error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(133):
error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(137):
error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(142):
error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(143):
error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(148):
error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(149):
error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(153):
error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(157):
error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(161):
error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(165):
error: identifier "__match64_all_sync" is undefined
20 errors detected in the compilation of "CMakeCUDACompilerId.cu".
# --error 0x2 --
Call Stack (most recent call first):
/usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
/usr/share/cmake-3.22/Modules/CMakeDetermineCompilerId.cmake:48 (__determine_compiler_id_test)
/usr/share/cmake-3.22/Modules/CMakeDetermineCUDACompiler.cmake:298 (CMAKE_DETERMINE_COMPILER_ID)
/usr/local/ascent-checkout/lib/cmake/ascent/thirdparty/BLTSetupCUDA.cmake:67 (enable_language)
/usr/local/ascent-checkout/lib/cmake/ascent/BLTSetupTargets.cmake:97 (include)
/usr/local/ascent-checkout/lib/cmake/ascent/AscentConfig.cmake:156 (include)
CMakeLists.txt:370 (find_package)
-- Configuring incomplete, errors occurred!
See also "/parthenon/build-ascent/CMakeFiles/CMakeOutput.log".
See also "/parthenon/build-ascent/CMakeFiles/CMakeError.log".
I don't get it...
$ cat /parthenon/build-ascent/CMakeFiles/CMakeError.log
#$ "/parthenon/external/Kokkos/bin"/nvcc_wrapper -D__CUDA_ARCH_LIST__=520 -E -x c++ -D__CUDACC__ -D__NVCC__ "-I/usr/local/cuda/bin/../targets/x86_64-linux/include" -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=0 -D__CUDACC_VER_BUILD__=140 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=0 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp4.ii"
<command-line>: warning: "__CUDA_ARCH_LIST__" redefined
<command-line>: note: this is the location of the previous definition
#$ cudafe++ --c++17 --gnu_version=110400 --display_error_number --orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name "/parthenon/build-ascent/CMakeFiles/3.22.1/CompilerIdCUDA/CMakeCUDACompilerId.cu" --allow_managed --m64 --parse_templates --gen_c_file_name "tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name "CMakeCUDACompilerId.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "tmp/CMakeCUDACompilerId.module_id" "tmp/CMakeCUDACompilerId.cpp4.ii"
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(94): error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(98): error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(103): error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(104): error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(109): error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(110): error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(114): error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(118): error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(122): error: identifier "__match32_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(126): error: identifier "__match64_any_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(133): error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(137): error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(142): error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(143): error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(148): error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(149): error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(153): error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(157): error: identifier "__match64_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(161): error: identifier "__match32_all_sync" is undefined
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/sm_70_rt.hpp(165): error: identifier "__match64_all_sync" is undefined
20 errors detected in the compilation of "CMakeCUDACompilerId.cu".
# --error 0x2 -- |
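The two dumps above repeat the same few diagnostics many times. A small helper like the following condenses a `CMakeError.log` to the distinct undefined identifiers and to the `-ccbin` host compiler that nvcc was handed. It is a sketch that assumes only standard `sed` and `sort`, and the function names are hypothetical.

```shell
#!/bin/sh
# undefined_identifiers LOGFILE: list each identifier reported as undefined, once.
undefined_identifiers() {
  sed -n 's/.*identifier "\([^"]*\)" is undefined.*/\1/p' "$1" | sort -u
}

# host_compiler_flag LOGFILE: print the value passed to nvcc via -ccbin=, if any.
host_compiler_flag() {
  sed -n 's/.*-ccbin=\([^ ;]*\).*/\1/p' "$1" | sort -u
}
```

For example, `undefined_identifiers build-ascent/CMakeFiles/CMakeError.log` reduces the 20 reported errors to the handful of distinct `__match*_sync` names, which is easier to quote in an upstream bug report.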
Just wondering, why is the CUDA_ARCH set to Maybe a CUDA 12.0 bug? |
One change I wanted to make was to have it run inside the container by default as a non-root user. This avoids the OpenMPI warnings about running as the root user, and also prevents container users from accidentally destroying system packages or the pre-installed dependencies. |
Since Parthenon builds without Ascent support enabled in the container, I think this is an Ascent (or maybe BLT) bug. I have no interest in Ascent support, so I can't debug this any further. |
I wonder if it is possible to reproduce this with the BLT CalcPi tutorial: https://github.com/LLNL/blt/tree/develop/docs/tutorial/calc_pi |
I can't reproduce with BLT alone. It also doesn't work on CUDA 12.1 for me. My guess is that this is an Ascent bug introduced in the past few months that only manifests for Kokkos apps. |
I think this is just what cmake uses to check the cuda compiler because it is the default in |
I'm going to ask the Ascent devs. |
PR Summary
Updates the CUDA CI image to CUDA 12.1, Ubuntu 22.04, and Clang 19.
Adds a .devcontainer/devcontainer.json file to support GitHub Codespaces and VSCode Dev Containers out of the box.
The old container was based on an image that was no longer supported upstream. Unfortunately, the new image is also deprecated, since we have to use an old CUDA 12 minor version. (Ascent cannot be built with CUDA 12.4+ due to a lack of VTK-m support for those versions.)
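The CUDA constraint described above could be turned into a guard that runs before an Ascent build is attempted. The sketch below only flags versions the PR reports as failing (12.4 and newer); versions below that are not guaranteed to work, since the thread only confirms 12.0. The function names are hypothetical, and treating major versions beyond 12 as affected is an assumption.

```shell
#!/bin/sh
# parse_nvcc_release: read `nvcc --version` output on stdin, print MAJOR.MINOR.
parse_nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# cuda_known_bad_for_ascent VERSION: exit status 0 if VERSION (MAJOR.MINOR)
# is 12.4 or newer, the range reported as unable to build Ascent.
# Assumption: major versions above 12 are treated as affected too.
cuda_known_bad_for_ascent() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -gt 12 ] || { [ "$major" -eq 12 ] && [ "$minor" -ge 4 ]; }
}
```

A build script might use it as `cuda_known_bad_for_ascent "$(nvcc --version | parse_nvcc_release)" && echo "skipping Ascent"`.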
This container should be built as a multi-arch image for both x86_64 and ARM support, e.g.:
NOTE: It must be built in a VM with a large amount of memory. I was not able to build the container image without 15 GB RAM and 4 GB swap for the Docker VM (otherwise, the VTK-m build will be killed by the OOM killer).
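The original build command is not preserved in this text. As a rough sketch, a multi-arch build with Docker Buildx usually looks like the following; the image tag and build context are placeholders, not values from the PR. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Assumptions (not from the PR): the image tag and the build context directory.
IMAGE_TAG="${IMAGE_TAG:-example/cuda-ci:latest}"   # hypothetical tag
CONTEXT_DIR="${CONTEXT_DIR:-.}"                    # directory containing the Dockerfile

# Compose the Buildx command for the two architectures named in the PR summary.
multiarch_build_cmd() {
  echo "docker buildx build --platform linux/amd64,linux/arm64 --tag $IMAGE_TAG --push $CONTEXT_DIR"
}

# Print the command for review; pipe the output to sh to actually run it.
multiarch_build_cmd
```

Multi-platform builds generally need a Buildx builder that supports them (for example one created with `docker buildx create --use`), and `--push` requires being logged in to the target registry.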
PR Checklist