cmake updates worth considering #9

svenevs · 2022-10-19T21:11:52Z

This project should consider

Stop using the deprecated find_package(CUDA). You only need the cuda runtime for this project from a quick glance (so no need to have project(... LANGUAGES CUDA) since no CUDA source files. The update process may be painful depending on what you were or were not using from find_package(CUDA). You should only need to link against CUDA::cudart from a cursory glance.
Right now a hard dependency on hdf5-shared shows up in the generated cmake config files that are installed. That may be problematic for an application that is consuming hdf5-static? Unclear what the right solution is.
The header file directory is not included in the interface library include directories for the generated cmake config files that are installed -- see NOTE 4 at the bottom of the linked code.

The text was updated successfully, but these errors were encountered:

brtnfld · 2022-10-20T03:32:06Z

We are working on updating CMake; first, we are trying to get it to work with GitHub actions. It looks to me that the installation instructions are not quite right. I've removed the use of find_package(CUDA), but it still has an issue finding/linking the cuda libraries. I'm wondering, though, if we should be using FindCUDAToolkit instead.

Currently, we are experimenting with it at, no successful build yet:
https://github.com/brtnfld/vfd-gds

svenevs · 2022-10-20T04:15:37Z

We are working on updating CMake; first, we are trying to get it to work with GitHub actions.

Always an adventure 🙂 Are CMake changes needed to get them running at all? It could be an environment problem, I don't see anything overtly wrong with the current system -- it's just the old cmake style.

I've removed the use of find_package(CUDA), but it still has an issue finding/linking the cuda libraries.

The process is a bit more involved, partly because you have to adopt some target-based approaches related to the packaging.

From what I can see this library needs to be able to (1) compile and link against the cuda runtime and (2) specifically cufile. In the previous versions of CMake tactics you'd accumulate a list of include directories, libraries to link, and compiler definitions and store them all in a list. Since

vfd-gds/src/CMakeLists.txt

Line 36 in 4906c76

add_library(hdf5_vfd_gds ${HDF5_VFD_GDS_SRCS})

is going to ultimately include / link against

vfd-gds/CMakeLists.txt

Lines 85 to 93 in 4906c76

 set(HDF5_VFD_GDS_EXT_INCLUDE_DEPENDENCIES 

 ${HDF5_VFD_GDS_EXT_INCLUDE_DEPENDENCIES} 

 ${CUDA_INCLUDE_DIRS} 

 ) 

 set(HDF5_VFD_GDS_EXT_LIB_DEPENDENCIES 

 ${HDF5_VFD_GDS_EXT_LIB_DEPENDENCIES} 

 ${CUDA_LIBRARIES} 

 ${HDF5_VFD_GDS_CUFILE_LIB} 

 )

where CUDA_INCLUDE_DIRS and CUDA_LIBRARIES came from find_package(CUDA). So to replace it using the target-based approach, you instead do (untested not on cuda machine)

find_package(CUDAToolkit)
# ... at some point after `add_library` aka this target exists ...
# note: you can `add_library(target-name "")` at the beginning of your CMake with no sources
# so that the target exists immediately and then do `target_sources` later on, mentioning because
# your dependencies search happens before the `add_subdirectory`
# https://cmake.org/cmake/help/latest/module/FindCUDAToolkit.html#cuda-toolkit-rt-lib
target_link_libraries(hdf5_vfd_gds PUBLIC CUDA::cudart)  # or CUDA::cudart_static but less common

Similarly for cufile you can CUDA::cuFile (but this was added in 3.25 so that is your minimum version). If that's too recent you can keep the find_library there but instead of

vfd-gds/CMakeLists.txt

Line 73 in 4906c76

 set (HDF5_VFD_GDS_CUFILE_DIR ${CUDA_TOOLKIT_ROOT_DIR} CACHE PATH "Cufile installation directory for Nvidia GDS support") 

just do it after finding CUDAToolkit and use CUDAToolkit_LIBRARY_DIR for the search path (CUDA_TOOLKIT_ROOT_DIR also comes from find_package(CUDA)). That option (set(...CACHE)) can probably just be removed?

Note that once you start linking against targets in your build system, consumers need it in your config file. find_dependency(CUDAToolkit), more info here.

Hope that helps clarify some!

brtnfld · 2022-10-20T14:20:27Z

Indeed, if I don't use PR #5 CMake changes, it compiles successfully. But when I try to run the tests it fails because it is looking for
libcuda.so.1, but what is installed is:

libOpenCL.so
libOpenCL.so.1
libOpenCL.so.1.0
libOpenCL.so.1.0.0
libcudadevrt.a
libcudart.so
libcudart.so.11.0
libcudart.so.11.8.89
libcudart_static.a
libcufile.so
libcufile.so.0
libcufile.so.1.4.0
libcufile_rdma.so
libcufile_rdma.so.1
libcufile_rdma.so.1.4.0
libcufile_rdma_static.a
libcufile_static.a
libculibos.a

svenevs · 2022-10-21T19:19:10Z

But when I try to run the tests it fails because it is looking for libcuda.so.1

Hmm, that does seem like an installation problem. That's a pretty critical library to be missing (that's the driver api, as opposed to the runtime api cudart).

vfd-gds/.github/workflows/main.yml

Line 25 in 69d0c16

sudo apt-get install -y cuda-nvcc-11-8 libcufile-dev-11-8

I'm not sure about the ubuntu packaging and the -nvcc package specifically, the docs say to install the meta package. I think cuda-11-8 will also be available (worth testing in a docker image rather than debugging package names via GH actions 😉).

Alternatively if you wanted to get fancy you might be able to download the runfile and cache the download rather than reinstalling the cuda packages each time. Or possibly use one of the cuda docker images?

it compiles successfully. But when I try to run the tests it fails

All that said, even if libcuda.so ends up being installed / that is resolved -- can you run GPU / CUDA anything in the GH hosted runners? I don't think they have GPUs attached, and if a GPU is required to actually run the tests you'd need to have a CI setup that has one or host it.

brtnfld · 2022-10-21T19:25:55Z

Right, I don't think it works without a GPU attached. So we decided to leave GitHub actions to test if it builds only.

I'm not aware of any ECP machines which support GPU storage direct. Is there currently a system that E4S will be installed on that supports it that you know of?

svenevs · 2022-10-25T18:16:14Z

I'm not aware of any ECP machines which support GPU storage direct. Is there currently a system that E4S will be installed on that supports it that you know of?

I am so sorry, it turns out this package was more in the "nice to have" category. I mistakenly thought one of the sites did have GDS enabled which is why I got so excited about just stamping a new release w/o any cmake changes 🙃

svenevs mentioned this issue Oct 19, 2022

Cannot compile via spack #8

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmake updates worth considering #9

cmake updates worth considering #9

svenevs commented Oct 19, 2022

brtnfld commented Oct 20, 2022

svenevs commented Oct 20, 2022 •

edited

Loading

brtnfld commented Oct 20, 2022

svenevs commented Oct 21, 2022

brtnfld commented Oct 21, 2022

svenevs commented Oct 25, 2022

cmake updates worth considering #9

cmake updates worth considering #9

Comments

svenevs commented Oct 19, 2022

brtnfld commented Oct 20, 2022

svenevs commented Oct 20, 2022 • edited Loading

brtnfld commented Oct 20, 2022

svenevs commented Oct 21, 2022

brtnfld commented Oct 21, 2022

svenevs commented Oct 25, 2022

svenevs commented Oct 20, 2022 •

edited

Loading