Skip to content
Alexandra Bozec edited this page Apr 29, 2019 · 2 revisions

OpenMP in HYCOM

The OpenMP directives allow HYCOM to run on multiple processors of shared memory machines. They can also be used in conjuction with MPI domain decomposition (MPI between multi-processor nodes, OpenMP on each node). This mode of parallelization is typically best for a relatively modest number of processors (2,3,4,6,8), although more can profitably be used on large grids.

The PARAMETER mxthrd has been added to mod_dimensions.F90 and dimensions.h. Each outer (i or j) loop is diviuded into mxthrd pieces by OpenMP. So edit mxthrd to be a multiple of NOMP (i.e. OMP_NUM_THREADS). It is often best to set mxthrd larger than NOMP, because that tends to give better land/sea load balance between threads. For example, mxthrd=16 could be used with 2, 4, 8 or 16 threads. Other good choices are 12, 24, 32 etcetera. Large values of mxthrd are only likely to be optimal for

Parallel sums are not performed via an OpenMP REDUCTION clause, because it is not bit for bit reproducable when run twice on the same data sets. Instead, row sums are done explicitly in parellel, followed by a serial sum of the row sums. This is probably slower than REDUCTION, but it is bit for bit reproducable on any number of processors. Please report any cases where HYCOM gives different answers for different values of OMP_NUM_THREADS (this is not supposed to happen).

Clone this wiki locally