Use MPI_Bcast instead of multiple p2p messages to update nest from parent #2059
…pl by a single MPI_Bcast
…pl by a single MPI_Bcast
… structures associated with Bcast_comm
…epends on NOAA-GFDL/FMS#1246 to be functional. More efficient 'if' test in fill_nested_grid_cpl()
…weather-model into hafs-BcastFillNestedGridCpl Keeping pace with mainline
…fficient determination of ranks involved in the mpp_broadcast call in fill_nested_grid_cpl routine
@dkokron you'll want to pull from the ufs-community/ufs-weather-model develop branch and then update your submodule hashes. This applies to your FV3 PR as well, so that the updates are brought into UFSWM.
@dkokron I think branches got synced ok. Can you check files changed?
Skipping WCOSS2 as it's down today. Will be back at 20Z if PR gets delayed past that.
@dkokron @jkbk2004 fv3atm is merged. Please update hash and revert gitmodule url.
PR Author Checklist:
I have linked PR's from all sub-components involved in section below.
I am confirming reviews are completed in ALL sub-component PR's.
I have run the full RT suite on either Hera/Hercules AND have attached the log to this PR below this line:
RegressionTests_hercules.log
I have added the list of all failed regression tests to "Anticipated changes" section.
I have filled out all sections of the template.
Performance profiling of a HAFS case on NOAA systems revealed that a significant amount of time was spent in fill_nested_grid_cpl(), a routine in FV3/atmos_cubed_sphere/driver/fvGFS/atmosphere.F90. This routine gathers a global SST field (6,336,000 bytes) onto rank 0, then sends that field to every rank in the nest. The code uses point-to-point (p2p) messages (Isend/Recv) from rank 0 to the nest ranks. This communication pattern saturates the Slingshot-10 link on the first node, resulting in a 0.15 s hit every fifth time step.
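To make the cost of the p2p pattern concrete, here is a rough back-of-envelope sketch in Python. Only the field size (6,336,000 bytes) comes from this PR. The nest rank count and the link bandwidth are placeholder assumptions, and the model ignores latency, protocol overhead, and ranks that share a node with rank 0.

```python
# Back-of-envelope estimate of the data that rank 0 must push through its
# node's network link when it sends the full SST field to every nest rank.

SST_FIELD_BYTES = 6_336_000  # field size quoted in the PR description


def root_egress_bytes(num_receivers, field_bytes=SST_FIELD_BYTES):
    """Bytes sent by rank 0 when each receiver gets its own p2p copy."""
    return num_receivers * field_bytes


def transfer_seconds(total_bytes, link_bytes_per_s):
    """Idealized time to move total_bytes over one link (bandwidth only)."""
    return total_bytes / link_bytes_per_s


if __name__ == "__main__":
    nest_ranks = 100             # hypothetical: not stated in the PR
    link_bytes_per_s = 12.5e9    # assumption: a 100 Gb/s link

    total = root_egress_bytes(nest_ranks)
    seconds = transfer_seconds(total, link_bytes_per_s)
    print(f"rank 0 sends {total:,} bytes, at least {seconds:.4f} s on the link")
```

The point of the sketch is that the bytes leaving rank 0 grow linearly with the number of nest ranks, so a single link becomes the limiting resource as the nest decomposition grows.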
The proposed fix is to modify the relevant FV3 code to use a single MPI_Bcast (via mpp_broadcast()) instead of multiple point-to-point messages (via mpp_send/mpp_recv). The use of mpp_broadcast depends on a fix to FMS that was merged on 16 June and is available in version 2023.02 of that package.
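The following Python sketch illustrates why a collective broadcast can relieve the root's link. It simulates a binomial-tree broadcast, which is one common textbook algorithm. This is an assumption for illustration only: the algorithm actually used by MPI_Bcast is chosen by the MPI library, and nothing here reflects the FMS or FV3 code. The function names are hypothetical.

```python
# Simulate a binomial-tree broadcast schedule and compare how many messages
# the root sends with the one-message-per-receiver p2p pattern.


def binomial_broadcast_schedule(num_ranks):
    """Return a list of rounds; each round is a list of (sender, receiver).

    At the start of a round with stride `step`, ranks 0..step-1 hold the
    data, and each of them forwards it to rank sender + step if that rank
    exists. The set of ranks holding the data doubles every round.
    """
    rounds = []
    step = 1
    while step < num_ranks:
        sends = [(src, src + step) for src in range(step)
                 if src + step < num_ranks]
        rounds.append(sends)
        step *= 2
    return rounds


def root_sends(rounds):
    """Count the messages sent by rank 0 across all rounds."""
    return sum(1 for sends in rounds for src, _ in sends if src == 0)


def p2p_root_sends(num_ranks):
    """In the p2p pattern, rank 0 sends one message to every other rank."""
    return num_ranks - 1


if __name__ == "__main__":
    n = 64  # hypothetical rank count
    schedule = binomial_broadcast_schedule(n)
    print("rounds:", len(schedule))
    print("root sends (tree):", root_sends(schedule))
    print("root sends (p2p): ", p2p_root_sends(n))
```

In this model the root's message count grows with the logarithm of the rank count instead of linearly, because ranks that already hold the data forward it onward over their own links.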
This PR depends on
This PR also depends on
I ran the UFS regression suite on acorn and cactus. Both runs resulted in "REGRESSION TEST WAS SUCCESSFUL". I do not have access to Hera/Hercules.
This change is zero-diff. No need to update baselines.
Commit Message
Subcomponents involved:
Anticipated Changes
Input data
Regression Tests:
Tests affected by changes in this PR:
Libraries
Code Managers Log
Testing Log: