OpenMP offload
Welcome to the miniqmc for OpenMP offload wiki!
Check out the OMP_offload branch:
git checkout OMP_offload
See build options in miniQMC How-to Guides.
A new CMake option, ENABLE_OFFLOAD, turns offloading on or off:
-DENABLE_OFFLOAD=1 # offload to accelerators like GPU
-DENABLE_OFFLOAD=0 # default, offload to CPU host
OFFLOAD_TARGET selects an offload target when the compiler supports more than one, as Clang and GNU do.
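The two options above can be combined on one configure line. The following is a minimal sketch, not part of miniQMC: the helper name offload_cmake_line is hypothetical, and it only prints the cmake command so the result can be inspected before running it from a build directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of miniQMC): prints a cmake configure line.
# Usage: offload_cmake_line <compiler> <0|1> [offload_target]
offload_cmake_line() {
  compiler=$1
  enable=$2
  target=${3:-}
  line="cmake -DCMAKE_CXX_COMPILER=$compiler -DENABLE_OFFLOAD=$enable"
  # Add OFFLOAD_TARGET only when offload is on and a target was named.
  if [ "$enable" = "1" ] && [ -n "$target" ]; then
    line="$line -DOFFLOAD_TARGET=$target"
  fi
  echo "$line .."
}

# Offload build with an explicit target, then a host-only build.
offload_cmake_line clang++ 1 amdgcn-amd-amdhsa
offload_cmake_line g++ 0
```

Printing instead of executing keeps the sketch safe to try on a machine without an offload-capable compiler; paste the printed line into a shell inside the build directory to configure.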
The offload feature is currently implemented in the miniqmc miniapp, which accepts the command line arguments -g, -w, -a, -m and -n:
-g adjusts supercell size
-w number of walkers. Equal to the number of CPU threads if not specified.
-a tiling (cache blocking) size. Equal to the number of splines if not specified.
-m spline mesh "px py pz"
-n number of iterations
The old check_spo has been renamed check_spo_batched. The following option is available only with check_spo_batched:
-f avoids transferring data back for checking. Must be used when measuring performance.
OMP_NUM_THREADS=10 ./bin/miniqmc -g "2 2 1"
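To compare runs at several thread counts, the command above can be generated in a loop. This is a sketch under assumptions: the function name print_sweep, the ITERS variable, and the chosen thread counts are illustrative and not part of miniQMC. It prints the commands rather than running them.

```shell
#!/bin/sh
# Dry run: print one miniqmc command per thread count instead of executing it.
# -w is not passed, so the walker count follows the thread count (see -w above).
# ITERS (the -n value) is an illustrative assumption; override it in the environment.
ITERS=${ITERS:-10}

print_sweep() {
  for threads in "$@"; do
    printf 'OMP_NUM_THREADS=%s ./bin/miniqmc -g "2 2 1" -n %s\n' "$threads" "$ITERS"
  done
}

print_sweep 1 2 5 10
```

Pipe the output to sh from the build directory to execute the sweep.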
Update on Nov 17th 2019
IBM XL: last verified on 16.1.1-5
cmake -DCMAKE_CXX_COMPILER=xlC_r -DENABLE_OFFLOAD=1 ..
With old versions of CMake (<3.11), XL is identified as Clang. The following workaround solves the issue:
cmake -DCMAKE_CXX_COMPILER=xlC_r -DCMAKE_CXX_COMPILER_ID='XL' -DENABLE_OFFLOAD=1 ..
Clang: last verified on 11-RC2
cmake -D CMAKE_CXX_COMPILER=clang++ -D ENABLE_OFFLOAD=1 -D USE_OBJECT_TARGET=ON ..
-D USE_OBJECT_TARGET=ON works around a static linking issue.
Intel oneAPI: last verified on beta08
cmake -D CMAKE_CXX_COMPILER=icpx -D ENABLE_OFFLOAD=1 -D OFFLOAD_TARGET=spir64 -DCMAKE_EXE_LINKER_FLAGS="-device-math-lib=fp64,fp32" ..
On some systems, forcing LIBOMPTARGET_PLUGIN=OPENCL is needed at runtime.
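One way to apply this setting without overriding a value chosen elsewhere is sketched below. Only the variable name and the OPENCL value come from the note above; the fallback logic is an assumption about how one might script it.

```shell
#!/bin/sh
# Keep a caller-provided value; otherwise fall back to OPENCL.
: "${LIBOMPTARGET_PLUGIN:=OPENCL}"
export LIBOMPTARGET_PLUGIN
echo "LIBOMPTARGET_PLUGIN=$LIBOMPTARGET_PLUGIN"
# Then launch as usual from the build directory, e.g.:
# OMP_NUM_THREADS=10 ./bin/miniqmc -g "2 2 1"
```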
AOMP: last verified on 11.8
cmake -D CMAKE_CXX_COMPILER=clang++ \
-D ENABLE_OFFLOAD=1 \
-D OFFLOAD_TARGET=amdgcn-amd-amdhsa \
-D OFFLOAD_ARCH=gfx906 ..
GCC: last verified on 9.2
cmake -D CMAKE_CXX_COMPILER=g++ -D ENABLE_OFFLOAD=1 ..
Cray: last verified on 9.0
module load craype-accel-nvidia60
cmake -D CMAKE_CXX_COMPILER=CC -DQMC_MPI=1 ..
cmake -DCMAKE_CXX_COMPILER=nvc++ -DENABLE_OFFLOAD=ON -DOFFLOAD_ARCH=cc80 -DQMC_MIXED_PRECISION=ON ..
The NVHPC compiler (nvc++) is buggy in its handling of map(always).
Compiler | Clang 12.0.0rc3 | AOMP 11.12-0 | XL 16.1.1-5 | OneAPI 2021.2.0 | Cray 11.0.2 | GCC 10.2 | NVHPC 21.02 |
---|---|---|---|---|---|---|---|
device | NVIDIA | AMD | NVIDIA | Intel | NVIDIA | NVIDIA | NVIDIA |
math header conflict | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
complex arithmetic | Pass | Pass | Pass | Pass | Fail | Pass | Pass |
math linker error | Pass | Pass | Pass | Pass | Fail | Pass | Fail |
static linking | Fail | Pass | Pass | Pass | Pass | Pass | Pass |
Async tasking | Pass | FC | Pass | FC | FC | Fail | Fail |
multiple streams | Pass | Pass | Pass | FC | FC | Pass | Pass |
check_spo | Pass | Pass | Pass | Pass(R) | Pass | Pass | Fail |
check_spo_batched | Pass | Pass | Pass | Pass(R) | Pass | Pass | Fail |
miniqmc_sync_move | Pass | Pass | Pass | Pass | Pass | Pass | Pass |
Pass: the intended feature is supported and runs correctly.
Fail: failure at compile, link or run time, or incorrect results.
FC: functionally correct; runs with correct results.
(R): regression in the current release.
- Cray 9.1 inherits Clang 9 math function issues.
- GNU 9.2: see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92553