Coarse grained OpenMP and running hybrid MPI+OpenMP in AMReX #2336

ashwathsv · 2021-09-17T22:53:45Z

Hi, how does coarse-grained OpenMP parallelism work in AMReX? I am asking with reference to the GPU/CNS tutorial, where Make.CNS sets AMReX up to use a coarse-grained OpenMP approach.

From the documentation, I understand that when compiling with MPI+OpenMP, each MPI rank/processor gets its own DistributionMapping (which may have multiple MFIter loops), and OpenMP threads are spawned on each MPI rank to run the MFIter loops in parallel. This appears to be fine-grained OpenMP parallelism.

How does the coarse-grained OpenMP approach (enabled by setting 'DEFINES += -DAMREX_CRSEGRNDOMP') differ from the fine-grained approach? Further, what would be the best approach for achieving maximum compute performance in a hybrid MPI+OpenMP code?

I would really appreciate your thoughts on this. Thanks in advance.

WeiqunZhang (Maintainer) · 2021-09-18T22:27:48Z

Our OpenMP strategy is explained in this paper (https://arxiv.org/abs/1604.03570). As for the AMREX_CRSEGRNDOMP macro, it was added to avoid nested OpenMP parallel regions when we started using the approach of OpenMP threading over tiles. Currently, it is used in only a few places in the AmrLevel and StateData classes, where user-provided functions are called. At the time, users' functions often contained omp parallel regions. After so many years, though, I think it's time for us to finally remove AMREX_CRSEGRNDOMP entirely.
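
To make the two placements of the parallel region concrete, here is a minimal standalone sketch in plain C++ with OpenMP. It is not AMReX code: `Range`, `make_tiles`, `scale_over_tiles`, and `scale_per_box` are hypothetical names invented for illustration, the "boxes" are 1D index ranges, and the kernel is a trivial scaling. It only contrasts one parallel region around a flattened tile list with a serial box loop whose kernel opens its own parallel region, as older user functions often did.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a box: a half-open 1D index range [lo, hi).
struct Range { int lo; int hi; };

// Cut every box owned by this rank into tiles of at most tile_size cells.
std::vector<Range> make_tiles(const std::vector<Range>& boxes, int tile_size)
{
    std::vector<Range> tiles;
    for (const Range& b : boxes) {
        for (int lo = b.lo; lo < b.hi; lo += tile_size) {
            tiles.push_back({lo, std::min(lo + tile_size, b.hi)});
        }
    }
    return tiles;
}

// Threading over tiles: a single parallel region, with threads sharing the
// flattened tile list. The per-tile work must not open another parallel
// region, otherwise the regions would be nested.
// Assumes the boxes do not overlap, so no two tiles touch the same cell.
void scale_over_tiles(std::vector<double>& data, const std::vector<Range>& boxes,
                      int tile_size, double factor)
{
    const std::vector<Range> tiles = make_tiles(boxes, tile_size);
    const int ntiles = static_cast<int>(tiles.size());
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic)
#endif
    for (int t = 0; t < ntiles; ++t) {
        for (int i = tiles[t].lo; i < tiles[t].hi; ++i) {
            data[i] *= factor;
        }
    }
}

// Alternative placement: the box loop is serial and the kernel itself
// opens a parallel region for each box.
void scale_per_box(std::vector<double>& data, const std::vector<Range>& boxes,
                   double factor)
{
    for (const Range& b : boxes) {
#ifdef _OPENMP
#pragma omp parallel for
#endif
        for (int i = b.lo; i < b.hi; ++i) {
            data[i] *= factor;
        }
    }
}
```

Both functions produce the same result and also compile without OpenMP, in which case the pragmas are skipped. The point of the sketch is that calling a kernel like the one in `scale_per_box` from inside the tile loop of `scale_over_tiles` would create a parallel region within a parallel region, which is the situation the macro was introduced to avoid.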
