OpenMP offload
Welcome to the miniqmc for OpenMP offload wiki!
Check out the OMP_offload branch:
git checkout OMP_offload
See build options in miniQMC How-to Guides.
A new CMake option, ENABLE_OFFLOAD, turns offloading on or off:
-DENABLE_OFFLOAD=1 # offload to accelerators like GPU
-DENABLE_OFFLOAD=0 # default, offload to CPU host
OFFLOAD_TARGET selects an offload target when the compiler supports more than one, as Clang and GNU do.
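The two options above can be combined on one configure line. The sketch below only prints the command instead of running it; the helper name `offload_configure_cmd` is hypothetical and not part of miniQMC, and `spir64` is simply one of the target values used elsewhere on this page.

```shell
# Hypothetical helper (not part of miniQMC): print a CMake configure line.
# $1: ENABLE_OFFLOAD value (0 or 1, default 0)
# $2: optional OFFLOAD_TARGET value, only added when offload is enabled
offload_configure_cmd() {
  enable=${1:-0}
  target=${2:-}
  cmd="cmake -DENABLE_OFFLOAD=$enable"
  if [ "$enable" = "1" ] && [ -n "$target" ]; then
    cmd="$cmd -DOFFLOAD_TARGET=$target"
  fi
  printf '%s ..\n' "$cmd"
}

offload_configure_cmd 1 spir64   # prints: cmake -DENABLE_OFFLOAD=1 -DOFFLOAD_TARGET=spir64 ..
offload_configure_cmd 0          # prints: cmake -DENABLE_OFFLOAD=0 ..
```

Printing rather than executing keeps the sketch safe to try outside a build directory; the printed line can be pasted into a shell once it looks right.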
The offload feature is currently implemented in the miniqmc miniapp.
It accepts the command line arguments -g, -w, -a, -m and -n:
- -g adjusts the supercell size.
- -w sets the number of walkers. Defaults to the number of CPU threads.
- -a sets the tiling (cache blocking) size. Defaults to the number of splines.
- -m sets the spline mesh as "px py pz".
- -n sets the number of iterations.
The old check_spo has been renamed to check_spo_batched. The following option is only available with check_spo_batched:
-f skips transferring data back to the host for checking. Must be used when measuring performance.
OMP_NUM_THREADS=10 ./bin/miniqmc -g "2 2 1"
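Because -w defaults to the number of CPU threads, changing OMP_NUM_THREADS also changes the walker count. A minimal sketch of a thread scan follows; the helper name `miniqmc_thread_scan` is hypothetical, and it only prints the commands so they can be reviewed before running.

```shell
# Hypothetical helper (not part of miniQMC): print one miniqmc command per thread count.
# $1: supercell string passed to -g
# $2: iteration count passed to -n
# remaining arguments: OMP_NUM_THREADS values to scan
miniqmc_thread_scan() {
  supercell=$1
  iterations=$2
  shift 2
  for threads in "$@"; do
    printf 'OMP_NUM_THREADS=%s ./bin/miniqmc -g "%s" -n %s\n' \
      "$threads" "$supercell" "$iterations"
  done
}

# Prints three command lines, for 1, 2 and 4 threads.
miniqmc_thread_scan "2 2 1" 5 1 2 4
```

To execute the scan, the printed lines can be piped to `sh` from the build directory.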
Update on Nov 17th 2019
cmake -DCMAKE_CXX_COMPILER=xlC_r -DENABLE_OFFLOAD=1 ..
With old versions of CMake (<3.11), XL is misidentified as Clang. The following workaround solves the issue:
cmake -DCMAKE_CXX_COMPILER=xlC_r -DCMAKE_CXX_COMPILER_ID='XL' -DENABLE_OFFLOAD=1 ..
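The choice between the two XL configure lines depends only on the CMake version, so it can be scripted. The sketch below assumes a version string of the form `major.minor[.patch]`; the helper names `cmake_older_than_3_11` and `xl_configure_cmd` are hypothetical, and the command is printed rather than run.

```shell
# Hypothetical helper: exit status 0 if the given CMake version is older than 3.11.
# Assumes a "major.minor[.patch]" version string.
cmake_older_than_3_11() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  [ "$major" -lt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -lt 11 ]; }
}

# Hypothetical helper: print the XL configure line, adding the compiler ID
# workaround only for CMake versions older than 3.11.
xl_configure_cmd() {
  version=$1
  extra=""
  if cmake_older_than_3_11 "$version"; then
    extra=" -DCMAKE_CXX_COMPILER_ID='XL'"
  fi
  printf "cmake -DCMAKE_CXX_COMPILER=xlC_r%s -DENABLE_OFFLOAD=1 ..\n" "$extra"
}

xl_configure_cmd 3.10.2   # includes -DCMAKE_CXX_COMPILER_ID='XL'
xl_configure_cmd 3.16.0   # plain configure line
```

The installed version can be read from the first line of `cmake --version`, for example with `cmake --version | head -n 1 | awk '{print $3}'`.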
cmake -D CMAKE_CXX_COMPILER=clang++ -D ENABLE_OFFLOAD=1 -D USE_OBJECT_TARGET=ON ..
-D USE_OBJECT_TARGET=ON works around a static linking issue.
cmake -D CMAKE_CXX_COMPILER=icpx -D ENABLE_OFFLOAD=1 -D OFFLOAD_TARGET="spir64" ..
Since beta08, -D USE_OBJECT_TARGET=ON is no longer needed.
cmake -D CMAKE_CXX_COMPILER=clang++ \
-D ENABLE_OFFLOAD=1 \
-D OFFLOAD_TARGET=amdgcn-amd-amdhsa \
-D OFFLOAD_ARCH=gfx906 ..
cmake -D CMAKE_CXX_COMPILER=g++ -D ENABLE_OFFLOAD=1 ..
module load craype-accel-nvidia60
cmake -D CMAKE_CXX_COMPILER=CC -DQMC_MPI=1 ..
Not yet supported.
Compiler | Clang 11 | AOMP 11.8-0 | XL 16.1.1-5 | Cray 9.0 | GCC 9.2 | GCC 10 |
---|---|---|---|---|---|---|
device | NV | AMD | NV | NV | NV | AMD |
math header conflict | P | P | P | P | P | - |
math linker error | P | P | P | F | P | - |
declare target static data | P | P | P | P | F | - |
static linking | F | P | P | P | F | - |
Async tasking | F | F | P | F | F | - |
multiple stream | P | P | P | F | F | - |
check_spo | P | P | P | P | FL | - |
check_spo_batched | P | P | P | P | FL | - |
miniqmc_sync_move | P | P | P | P | FL | - |
- AOMP 0.7-5, or possibly ROCm, has a regression that causes a Linux kernel segfault. https://github.com/ROCm-Developer-Tools/aomp/issues/45
- Cray 9.1 inherits Clang 9 math function issues.
- GNU 9.2 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92553
P pass
F fail
FL fail in linking
FR fail in run
FW fail with wrong results
- not tested yet