NVHPC hackathon
TODO (done): add to scripts/env/saturn.sh and add cmake presets
module load cmake nvhpc
export PATH=/storage/users/s3j/spack/var/spack/environments/tools/.spack-env/view/bin:$PATH
source /home/users/s3j/spack/share/spack/setup-env.sh
spack env activate celeritas-nvhpc
Config:
config:
  install_tree:
    root: $spack/opt/spack
    projections:
      all: "{architecture}/{name}/{version}/{hash:7}"
  deprecated: true
  build_jobs: 26
Find externals:
$ module load cmake cuda nvhpc
$ spack external find --scope=site
==> The following specs have been detected on this system and added to /storage/users/s3j/spack/etc/spack/packages.yaml
[email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected]
[email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected]
[email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected]
$ spack compiler find --scope=site
==> Added 3 new compilers to /storage/users/s3j/spack/etc/spack/compilers.yaml
[email protected] [email protected] [email protected]
==> Compilers are defined in the following files:
/storage/users/s3j/spack/etc/spack/compilers.yaml
(Also module load llvm and spack external find llvm python, and run ./scripts/dev/install-commit-hooks.sh.)
Add the spack environment and trim it down:
$ cat spack.yaml
spack:
  specs:
  - geant4@11 %nvhpc
  - googletest %nvhpc
  - hepmc3 %nvhpc
  - nlohmann-json %nvhpc
  view: true
  concretizer:
    unify: true
  packages:
    xerces-c:
      variants: netaccessor=none
    all:
      compiler: [nvhpc]
      providers:
        blas: [openblas]
        lapack: [openblas]
      variants: cxxstd=17
- The manual %nvhpc is due to a spack bug reported in https://github.com/spack/spack/issues/32170
- Use https://github.com/spack/spack/pull/32171 for the googletest C++ standard
Append /storage/users/s3j/spack/share/spack/modules/linux-ubuntu20.04-cascadelake to MODULEPATH; run spack module tcl refresh -y to regenerate.
Check out geant4 11.0.2
$ module load xerces-c-3.2.3-nvhpc-22.5-q6gsmka expat-2.4.8-nvhpc-22.5-3fw46xw geant4-data-11.0.0-nvhpc-22.5-thqh346
$ mkdir -p /home/users/s3j/geant4/share/Geant4-11.0.2/
$ ln -s /storage/users/s3j/spack/opt/spack/linux-ubuntu20.04-cascadelake/geant4-data/11.0.0/thqh346/share/geant4-data-11.0.0 /home/users/s3j/geant4/share/Geant4-11.0.2/data
$ initial-cmake.sh -f -b /storage/warpspeed/scratch/s3j/build-geant4 -p /home/users/s3j/geant4 . -DGEANT4_USE_GDML:BOOL=ON
+ exec cmake -G Ninja -DCMAKE_INSTALL_PREFIX:PATH=/home/users/s3j/geant4 -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=ON -DGEANT4_USE_GDML:BOOL=ON /home/users/s3j/.local/src/geant4/.
...
Bad thread-local attributes:
"/tmp/s3j/spack-stage/spack-stage-geant4-11.0.2-7dcvmghd6n6rnnglmgvigabdhifmk2cs/spack-src/source/processes/electromagnetic/dna/management/include/G4Octree.icc", line 35: error: thread-local declaration follows non-thread-local declaration (declared at line 163 of "/tmp/s3j/spack-stage/spack-stage-geant4-11.0.2-7dcvmghd6n6rnnglmgvigabdhifmk2cs/spack-src/source/processes/electromagnetic/dna/management/include/G4Octree.hh")
G4ThreadLocal G4Allocator<OCTREE>* OCTREE::fgAllocator = nullptr;
^
"/tmp/s3j/spack-stage/spack-stage-geant4-11.0.2-7dcvmghd6n6rnnglmgvigabdhifmk2cs/spack-src/source/processes/electromagnetic/dna/management/include/G4Octree.icc", line 35: error: thread-local storage class is not valid here
G4ThreadLocal G4Allocator<OCTREE>* OCTREE::fgAllocator = nullptr;
Worked around by disabling threads.
Excessive recursion from a questionable hack to add "extensible enums":
"/tmp/s3j/spack-stage/spack-stage-geant4-11.0.2-7dcvmghd6n6rnnglmgvigabdhifmk2cs/spack-src/source/processes/electromagnetic/dna/management/include/G4CTCounter.hh", line 79: error: excessive recursion at instantiation of class "G4Number<191>"
struct G4Number: public G4Number<N-1>{
^
detected during:
instantiation of class "G4Number<N> [with N=192]" at line 79
instantiation of class "G4Number<N> [with N=193]" at line 79
instantiation of class "G4Number<N> [with N=194]" at line 79
instantiation of class "G4Number<N> [with N=195]" at line 79
instantiation of class "G4Number<N> [with N=196]" at line 79
[ 54 instantiation contexts not shown ]
instantiation of class "G4Number<N> [with N=251]" at line 79
instantiation of class "G4Number<N> [with N=252]" at line 79
instantiation of class "G4Number<N> [with N=253]" at line 79
instantiation of class "G4Number<N> [with N=254]" at line 79
instantiation of class "G4Number<N> [with N=255]" at line 79 of "/tmp/s3j/spack-stage/spack-stage-geant4-11.0.2-7dcvmghd6n6rnnglmgvigabdhifmk2cs/spack-src/source/processes/electromagnetic/dna/molecules/management/include/G4VMolecularDissociationDisplacer.hh"
Fixed by adding -Wc,--pending_instantiations
See https://github.com/spack/spack/pull/32185
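For context, the recursion in the log above comes from a class template that inherits from its own N-1 instantiation, so naming the 255th level forces the compiler to instantiate every level beneath it. The following is a reduced sketch of that pattern only: the name Number and the value member are hypothetical and are not taken from the Geant4 source.

```cpp
#include <cassert>

// Hypothetical reduction of the "inherit from N-1" pattern seen in the log.
// Each Number<N> derives from Number<N-1>, so instantiating Number<255>
// requires a chain of 255 nested class instantiations.
template<int N>
struct Number : Number<N - 1>
{
    static constexpr int value = N;
};

// The explicit specialization at zero terminates the inheritance chain.
template<>
struct Number<0>
{
    static constexpr int value = 0;
};

// Forces the full chain to be instantiated at compile time.
static_assert(Number<255>::value == 255, "chain depth reaches 255");
```

A compiler whose nested-instantiation limit is lower than the chain depth rejects this kind of code until the limit is raised, which is what the flag above is for.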
[ FAILED ] Device failed to initialize: /home/users/s3j/.local/src/celeritas/src/corecel/sys/Device.cc:237:
celeritas: cuda error: the provided PTX was compiled with an unsupported toolchain.
cudaFree(nullptr)
(this was because I had the wrong arch and had CUDA enabled)
The tests were just a bit too picky.
In the test that calls UniformGridData::from_bounds(log(1.0), log(1e5), 6),
the delta ends up approximately log(10), but with one compiler (pgi) it is
2.3025850929940455 while with another compiler (gcc) it is
2.3025850929940459 (the final bit differs by one).
See 2e04478ea9831b5222d6ac53374f333d1cfa7677.
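As an illustration of why an exact comparison fails here, the two quoted values are adjacent doubles, so a test that tolerates a small relative error accepts both. This is a minimal sketch: ulp_distance and soft_equal are hypothetical helpers written for this note, and the referenced commit may take a different approach.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: count the representable doubles between a and b.
// Assumes both values are finite and positive, so that their bit patterns
// order the same way as the values themselves.
inline std::int64_t ulp_distance(double a, double b)
{
    std::int64_t ia = 0;
    std::int64_t ib = 0;
    std::memcpy(&ia, &a, sizeof(double));
    std::memcpy(&ib, &b, sizeof(double));
    return ia > ib ? ia - ib : ib - ia;
}

// Hypothetical helper: relative-tolerance comparison for test assertions.
inline bool soft_equal(double expected, double actual, double rel_tol = 1e-12)
{
    return std::fabs(actual - expected) <= rel_tol * std::fabs(expected);
}
```

With the values from the note, ulp_distance(2.3025850929940455, 2.3025850929940459) is 1, operator== reports them as different, and soft_equal returns true.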
Changing all the OpenMP loops to use std::for_each(std::execution::par_unseq, ...)
gives a compiler error on InitTracks.cc:
[26/375] Building CXX object src/CMakeFiles/celeritas.dir/celeritas/track/generated/InitTracks.cc.o
FAILED: src/CMakeFiles/celeritas.dir/celeritas/track/generated/InitTracks.cc.o
/packages/nvhpc/22.5_cuda11.7/Linux_x86_64/22.5/compilers/bin/nvc++ -Dceleritas_EXPORTS -I/home/users/romano/celeritas/src -I/storage/warpspeed/scratch/romano/build-base/src -isystem /storage/packages/nvhpc/22.5_cuda11.7/Linux_x86_64/22.5/cuda/11.7/include -isystem /home/users/s3j/spack/var/spack/environments/celeritas-nvhpc/.spack-env/view/include -stdpar -Minfo -g -O0 -fPIC --c++14 -MD -MT src/CMakeFiles/celeritas.dir/celeritas/track/generated/InitTracks.cc.o -MF src/CMakeFiles/celeritas.dir/celeritas/track/generated/InitTracks.cc.o.d -o src/CMakeFiles/celeritas.dir/celeritas/track/generated/InitTracks.cc.o -c /home/users/romano/celeritas/src/celeritas/track/generated/InitTracks.cc
celeritas::generated::init_tracks(const celeritas::CoreRef<(celeritas::MemSpace)0> &, const celeritas::TrackInitStateData<(celeritas::Ownership)1, (celeritas::MemSpace)0> &, unsigned int):
31, stdpar: Generating NVIDIA GPU code
31, std::for_each with std::execution::par_unseq policy parallelized on GPU
NVC++-F-0155-Compiler failed to translate accelerator region (see -Minfo messages): Unsupported procedure (/home/users/romano/celeritas/src/celeritas/track/generated/InitTracks.cc: 1)
NVC++/x86-64 Linux 22.5-0: compilation aborted
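For reference, the kind of loop conversion described above has roughly the following shape. This is a generic sketch that requires C++17: scale_in_place is a hypothetical stand-in, not the Celeritas init_tracks kernel, and it does not reproduce or diagnose the "Unsupported procedure" failure.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical stand-in for a per-element kernel.
// The OpenMP form of the same loop would be:
//   #pragma omp parallel for
//   for (std::size_t i = 0; i < values.size(); ++i) values[i] *= factor;
inline void scale_in_place(std::vector<double>& values, double factor)
{
    // par_unseq allows the implementation to parallelize and vectorize;
    // the lambda must therefore be free of data races and must not lock.
    std::for_each(std::execution::par_unseq,
                  values.begin(),
                  values.end(),
                  [factor](double& x) { x *= factor; });
}
```

With nvc++ -stdpar the body of such a call is compiled for the GPU, as the -Minfo output above shows, so everything the lambda calls has to be translatable to device code.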