build-notes - summit #6

Open
naughtont3 opened this issue Feb 26, 2019 · 5 comments

Comments

@naughtont3
Owner

Building TAU + OMPI-instr-master on SUMMIT.

This issue tracks the build notes for tests with PR #5 on Summit.

@naughtont3
Owner Author

naughtont3 commented Feb 26, 2019

Build note: On Summit, I used an external hwloc for both OMPI and TAU to resolve the -lhwloc dependency that libunwind needs with TAU.

Note: I still ended up needing to hack a TAU Makefile. Ultimately, I believe I could have just removed -lhwloc from the TAU_MVAPICH2_LINKER_OPTIONS setting in TAU's include/Makefile.

Here's the manually modified line I used...

login3:$ grep TAU_MVAPICH2_LINKER_OPTIONS= include/Makefile
#TAU_MVAPICH2_LINKER_OPTIONS=-lhwloc #ENDIF##IBM64LINUX_XLC#
TAU_MVAPICH2_LINKER_OPTIONS=
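
The manual edit shown above could also be scripted. This is a minimal sketch, assuming the unmodified line starts with TAU_MVAPICH2_LINKER_OPTIONS= and contains -lhwloc. The function name strip_hwloc_linker_option is hypothetical and not part of TAU.

```shell
# Hypothetical helper (not part of TAU): comment out the line that sets
# TAU_MVAPICH2_LINKER_OPTIONS to a value containing -lhwloc, and add an
# empty definition right after it. A backup is kept as <makefile>.orig.
strip_hwloc_linker_option() {
    makefile="$1"
    cp "$makefile" "$makefile.orig" || return 1
    awk '
        /^TAU_MVAPICH2_LINKER_OPTIONS=.*-lhwloc/ {
            print "#" $0
            print "TAU_MVAPICH2_LINKER_OPTIONS="
            next
        }
        { print }
    ' "$makefile.orig" > "$makefile"
}
```

From the TAU source directory this would be called as `strip_hwloc_linker_option include/Makefile`, followed by the same grep as above to confirm the result.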

@naughtont3
Owner Author

Add tau_exec to the PATH...

export PATH=$TAU_INSTALL_DIR/ibm64linux/bin:$PATH

@naughtont3
Owner Author

naughtont3 commented Mar 7, 2019

More notes:

  • Remember to include --enable-mem-profile in OMPI configure

TODO:

  • Figure out why we fail with more than 2 processes and get TAU error messages (see TauMemory.cpp:Tau_stop_class_allocation) like this:
 ERROR: Overlapping allocations. Found pmix4x_opcaddy_t but opal_cleanup_fn_item_t expected.

This looks like an interleaving issue, as the backtrace shows we are failing during MPI_Init, somewhere in or around opal_datatype_init.

Note that things seem to work for -np 2 but fail at -np 4 (e.g., with osu hello).
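
To narrow down where the failure starts, the rank count can be stepped up until the TAU error appears. This is a rough sketch, assuming the failure always prints the "Overlapping allocations" message. The function name find_first_failing_np and the list of rank counts are hypothetical.

```shell
# Hypothetical helper: run the given command under mpirun at increasing
# rank counts and print the first count whose output contains the TAU
# "Overlapping allocations" error. Returns 1 if no count fails.
find_first_failing_np() {
    for np in 2 3 4; do
        if mpirun -np "$np" "$@" 2>&1 | grep -q 'Overlapping allocations'; then
            echo "$np"
            return 0
        fi
    done
    return 1
}
```

The arguments are the command to launch, for example tau_exec with its tag list followed by the test binary.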

@naughtont3
Owner Author

Looks like I did not build TAU with multithreading support, and the latest OMPI is more multi-threaded during init. So the fix appears to be:

    1. Configure TAU with -pthread
    2. Add pthread to the tau_exec tag list (-T)
  • NOTE: The profile.*.0.0 files will be the main threads of our ranks.

@naughtont3
Owner Author

naughtont3 commented Mar 13, 2019

Summit Build Summary:

  • OMPI

   ./configure \
       --enable-mem-profile \
       --enable-mpi1-compatibility \
       --with-hwloc=/ccs/home/naughton/projects.summit/ompix/mem-scale/source/hwloc-2.0.3/_install \
       --enable-orterun-prefix-by-default \
       --prefix=/gpfs/alpine/proj-shared/stf010/naughton/summit/ompix/install \
       --with-memory-manager=none \
       --disable-vt \
   && make -j 4 \
   && make install
  • Update environment (PATH and LD_LIBRARY_PATH)

  • TAU

   cd source/

   cd pdtoolkit-3.25
   export PDT_SRC_DIR=$PWD
   ./configure && \
   make && \
   make install
   # Make sure mpicc is in PATH
   cd ..   # back to source/
   cd tau-2.27.1/
   export TAU_INSTALL_DIR=$PWD
   ./configure \
       -pthread \
       -mpi \
       -bfd=download \
       -unwind=download \
       -pdt=$PDT_SRC_DIR \
       -prefix=$TAU_INSTALL_DIR
   # Manually remove `-lhwloc` from `TAU_MVAPICH2_LINKER_OPTIONS`
   vi include/Makefile
   make -j 4 && make install
  • Update environment (PATH and LD_LIBRARY_PATH)
summit:$ cat env_tau.sh

TAU_INSTALL_DIR=/gpfs/alpine/proj-shared/stf010/naughton/summit/ompix/tau/tau-2.27.1/_install
export PATH=$TAU_INSTALL_DIR/ibm64linux/bin:$PATH

summit:$ 
  • Usage

      mpirun -np 4   tau_exec -T mpi,pdt,pthread   ../ring_c
        # the 4 Ranks (main threads)
      ls profile.*.0.0
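
A quick sanity check after a run is to confirm there is one main-thread profile per rank. This is a minimal sketch, assuming the profiles land in the current directory and follow the profile.<rank>.0.0 naming noted above. The function name check_main_thread_profiles is hypothetical.

```shell
# Hypothetical helper: succeed only if DIR (default: current directory)
# holds exactly EXPECTED main-thread profiles named profile.*.0.0,
# i.e. one per MPI rank.
check_main_thread_profiles() {
    expected="$1"
    dir="${2:-.}"
    found=$(find "$dir" -maxdepth 1 -type f -name 'profile.*.0.0' | wc -l | tr -d ' ')
    echo "found $found of $expected main-thread profiles"
    [ "$found" -eq "$expected" ]
}
```

For the -np 4 run above, this would be called as `check_main_thread_profiles 4`.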
    
