Error while linking on Tioga (AMD) with GasNET #1555

Closed
Tracked by #1032
mariodirenzo opened this issue Sep 28, 2023 · 7 comments

@mariodirenzo

I am working on porting HTR to Tioga (https://hpc.llnl.gov/hardware/compute-platforms/tioga) and, as soon as I try to link any application with Legion, I get errors.
I am using CMake to build Legion and I've tried two options without success.

Option 1: Using the embedded GasNet:

I configure legion with the following command line

cmake -DLegion_USE_GASNet=ON -DLegion_USE_HIP=ON -DLegion_USE_HDF=ON -DLegion_USE_OpenMP=ON -DLegion_EMBED_GASNet=ON -DGASNet_CONDUIT=ofi -DGASNet_SYSTEM=slingshot11 .

I build it with

make -j

Then I try to test the installation by building circuit. I configure circuit in the folder examples/circuit/ with the following command

Legion_DIR=$LEGION_DIR cmake -DCMAKE_HIP_ARCHITECTURES=gfx90a .

and when I try to make it with make -j, I get the error

make[2]: *** No rule to make target '/usr/local/lib64/libgasnet-ofi-par.a', needed by 'circuit'.  Stop.

This means the build is looking for GASNet in the wrong path.
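One way to confirm the mismatch is to look for where the embedded build actually placed the library and compare it with the path in the error. The following is only a sketch: `find_lib` is a hypothetical helper name, and searching under `$LEGION_DIR` assumes the embedded GASNet was installed somewhere below the Legion build tree.

```shell
#!/bin/sh
# find_lib ROOT NAME
# Hypothetical helper: print every regular file called NAME below ROOT.
find_lib() {
    find "$1" -type f -name "$2" 2>/dev/null
}

# Example usage (paths are assumptions taken from the messages above):
#   find_lib "$LEGION_DIR" libgasnet-ofi-par.a
#   ls -l /usr/local/lib64/libgasnet-ofi-par.a   # the path the circuit build expects
```

If the first command prints a path and the second reports a missing file, the exported Legion configuration is referring to an install prefix that was never populated.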

Option 2: Forcing the path to GasNet:

Considering that the previous attempt has already built GASNet in the folder /g/g92/direnzo1/legion_tioga/embed-gasnet/install/, I try to force the path to GASNet by building Legion with

cmake -DLegion_USE_GASNet=ON -DLegion_USE_HIP=ON -DLegion_USE_HDF=ON -DLegion_USE_OpenMP=ON -DGASNet_ROOT_DIR=/g/g92/direnzo1/legion_tioga/embed-gasnet/install -DGASNet_CONDUIT=ofi -DGASNet_SYSTEM=slingshot11 .

In this case, if I try to build circuit with the same commands above, I get the following errors:

ld.lld: error: undefined symbol: MPI_Initialized
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a
>>> did you mean: PMI_Initialized
>>> defined in: /opt/cray/pe/pmi/default/lib/libpmi2.so

ld.lld: error: undefined symbol: MPI_Query_thread
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Init_thread
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Comm_group
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Comm_create
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Group_free
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Comm_size
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Comm_rank
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(gasneti_bootstrapInit_mpi) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Finalized
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapFini) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Comm_free
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapFini) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Finalize
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapFini) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Abort
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapAbort) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Barrier
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapBarrier) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Allgather
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapExchange) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Alltoall
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapAlltoall) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Bcast
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapBroadcast) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Isend
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapSNodeBroadcast) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Waitall
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapSNodeBroadcast) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

ld.lld: error: undefined symbol: MPI_Recv
>>> referenced by gasnet_bootstrap_mpi.c
>>>               gasnet_bootstrap_mpi-PAR.o:(bootstrapSNodeBroadcast) in archive /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a

I am not an expert in the CMake system used to build Legion, but it seems that the dependencies of GASNet are not correctly propagated.
Do you have any advice?

NOTE: if I deactivate the support of GasNET, the codes link fine.
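To see at a glance which MPI entry points the GASNet archive leaves unresolved, the output of `nm -u` can be filtered. This is a minimal sketch: `undefined_mpi_symbols` is a hypothetical helper, and it assumes the default GNU/LLVM `nm` layout in which undefined symbols appear as `U <name>`.

```shell
#!/bin/sh
# undefined_mpi_symbols
# Reads `nm -u` style output on stdin and prints the unique undefined
# symbols whose names start with MPI_, one per line, sorted.
undefined_mpi_symbols() {
    awk '$1 == "U" && $2 ~ /^MPI_/ { print $2 }' | sort -u
}

# Example usage (archive path copied from the linker errors above):
#   nm -u /g/g92/direnzo1/legion_tioga/embed-gasnet/install/lib/libgasnet-ofi-par.a \
#     | undefined_mpi_symbols
```

A non-empty list indicates that whatever links against the archive must also pull in an MPI library, which is the dependency that does not seem to be propagated here.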

@mariodirenzo
Author

Note that the following patch fixes the issue when building with the second option

diff --git a/cmake/FindGASNet.cmake b/cmake/FindGASNet.cmake
index 0ba7ba32d..2142c8b37 100644
--- a/cmake/FindGASNet.cmake
+++ b/cmake/FindGASNet.cmake
@@ -71,7 +71,9 @@ gasnet-ld:
gasnet-ldflags:
       @echo $(GASNET_LDFLAGS)
gasnet-libs:
-       @echo $(GASNET_LIBS)"
+       @echo $(GASNET_LIBS)
+gasnet-ld-reuires-mpi:
+       @echo $(GASNET_LD_REQUIRES_MPI)"
  )
  find_program(GASNet_MAKE_PROGRAM NAMES gmake make smake)
  mark_as_advanced(GASNet_MAKE_PROGRAM)
@@ -106,6 +108,12 @@ gasnet-libs:
      ERROR_VARIABLE _GASNet_LIBS_ERROR
      OUTPUT_STRIP_TRAILING_WHITESPACE
    )
+    execute_process(
+      COMMAND ${GASNet_MAKE_PROGRAM} -s -f ${_TEMP_MAKEFILE} gasnet-ld-reuires-mpi
+      OUTPUT_VARIABLE _GASNet_LD_REQUIRES_MPI
+      ERROR_VARIABLE _GASNet_LD_REQUIRES_MPI_ERROR
+      OUTPUT_STRIP_TRAILING_WHITESPACE
+    )
    file(REMOVE ${_TEMP_MAKEFILE})
  endif()
endmacro()
@@ -154,6 +162,10 @@ function(_GASNet_create_component_target _GASNet_MAKEFILE COMPONENT_NAME)
    endif()
    mark_as_advanced(GASNet_${L}_LIBRARY)
  endforeach()
+  if(_GASNet_LD_REQUIRES_MPI)
+     find_package(MPI REQUIRED COMPONENTS C)
+     list(APPEND COMPONENT_DEPS ${MPI_C_LIBRARIES})
+  endif()
  if(_GASNet_LD MATCHES "^(/.*/)?mpi[^/]*" AND NOT (_GASNet_LD STREQUAL CMAKE_C_COMPILER))
    set(MPI_C_COMPILER ${_GASNet_LD})
    find_package(MPI REQUIRED COMPONENTS C)

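The patch relies on the trick FindGASNet.cmake already uses for GASNET_LIBS: run make against a temporary makefile whose only job is to echo a variable. The same query can be reproduced by hand to check what GASNet reports. This is a sketch under assumptions: `query_mak_var` is a hypothetical helper, and the location of the conduit fragment under the install prefix (`include/ofi-conduit/ofi-par.mak`) is assumed, not taken from this thread.

```shell
#!/bin/sh
# query_mak_var FRAGMENT VARNAME
# Prints the value of make variable VARNAME as defined by the makefile
# fragment FRAGMENT, using a throwaway wrapper makefile.
query_mak_var() {
    fragment=$1
    var=$2
    wrapper=$(mktemp) || return 1
    # The \t is the tab character that make requires before a recipe line.
    printf 'include %s\nquery:\n\t@echo $(%s)\n' "$fragment" "$var" > "$wrapper"
    "${MAKE:-make}" -s -f "$wrapper" query
    status=$?
    rm -f "$wrapper"
    return $status
}

# Example usage (fragment path is an assumption):
#   query_mak_var \
#     /g/g92/direnzo1/legion_tioga/embed-gasnet/install/include/ofi-conduit/ofi-par.mak \
#     GASNET_LD_REQUIRES_MPI
```

If the variable comes back set, the `if(_GASNet_LD_REQUIRES_MPI)` branch added by the patch would be expected to append the MPI libraries to the component's dependencies.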
@mariodirenzo
Author

@elliottslaughter, please add this issue to #1032

@elliottslaughter
Contributor

Could you share your values of CC and CXX? Or, if possible, upload the CMakeCache.txt.
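The requested information can be collected in one go. This is a sketch only: `cache_compilers` is a hypothetical helper, and it assumes the usual `NAME:TYPE=VALUE` layout of CMakeCache.txt entries.

```shell
#!/bin/sh
# cache_compilers CACHE_FILE
# Prints the C, C++ and HIP compiler entries recorded in a CMakeCache.txt.
# The trailing colon in the pattern excludes entries such as
# CMAKE_C_COMPILER_AR.
cache_compilers() {
    grep -E '^CMAKE_(C|CXX|HIP)_COMPILER:' "$1"
}

# Example usage, run from the Legion build directory:
#   echo "CC=${CC:-<unset>} CXX=${CXX:-<unset>}"
#   cache_compilers CMakeCache.txt
```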

@mariodirenzo
Author

The CMakeCache.txt of an installation generated with option 1 is
CMakeCache_badOpt1.txt

The CMakeCache.txt of an installation generated with option 2 without the patch above is
CMakeCache_badOpt2.txt

The CMakeCache.txt of a fully functional installation generated with option 2 and the patch above is
CMakeCache_goodOpt2.txt

The failure with option 1 is somehow related to this line of code, though I haven't been able to get a good installation
https://gitlab.com/StanfordLegion/legion/-/blob/master/runtime/CMakeLists.txt#L472

@elliottslaughter
Contributor

I posted your suggested patch with minor modifications here:

https://gitlab.com/StanfordLegion/legion/-/merge_requests/1362

@elliottslaughter
Contributor

I merged the patch into master. @mariodirenzo, please confirm that the issue is fixed.

@mariodirenzo
Author

It works, thank you!
