
Enable cmeps to use PIO+PNETCDF for IO in UFS #2347

Open · DeniseWorthen opened this issue Jun 28, 2024 · 13 comments
Labels: enhancement (New feature or request)

@DeniseWorthen commented Jun 28, 2024:

Description

Currently CMEPS in UFS does not make use of PIO options; restart (and history) writing goes through serial netcdf. CMEPS has an existing capability to write using PIO+PnetCDF, with control of the various PIO options (e.g., stride, numiotasks) through configuration.

Solution

Parallel writes for CMEPS should be enabled in UFS by setting the appropriate PIO config options. Scalability testing should be done to determine appropriate values for the PIO settings; a sketch of the relevant attributes follows.
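
For concreteness, a minimal sketch of the relevant MED_attributes entries (attribute names as confirmed later in this thread; the values shown are placeholders to be tuned by the scalability testing):

MED_attributes::
  ...
  pio_typename = pnetcdf   # parallel writes via PnetCDF instead of serial netcdf
  pio_stride = 4           # place an IO task on every 4th mediator PE
  pio_rearranger = subset  # subset rearranger
  ...
::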

Alternatives

Related to

See oceanmodeling/CMEPS#1 for an example of this issue arising in the coastal modeling effort.

@uturuncoglu commented:

@DeniseWorthen I think you mean that the PIO options need to be added to the UFS template files, right? I just want to clarify. The capability to use different options for PIO is already implemented in CMEPS and CDEPS.

@DeniseWorthen commented:

Yes, exactly. I will clarify the issue description.

@DeniseWorthen changed the title from "enable cmeps IO using pnetcdf" to "Enable cmeps to use PIO+PNETCDF for IO in UFS" on Jun 28, 2024
@DeniseWorthen commented:

I set up an ATM-OCN-ICE case (C384, 1/4 deg) on Gaea-C5. I turned off all history and restart writing except for CMEPS; to do this for OCN and ICE, I manually overrode the write-restart logicals in the code and set them to false prior to compiling. I removed the write grid component (WGC) for the ATM, used a layout of 16x24, and did not use threading for the ATM. This gave me a max of 2304 PEs for CMEPS (16 x 24 x 6 tiles = 2304). I made a series of 24-hour runs with mediator restarts at 3-hour intervals, giving a total of 8 mediator restart writes, and recorded the min/max and mean times for med_phase_restart_write from the ESMF Profile Summary log.

Using the config variables in ufs.configure, I did three sets of runs, each at 300, 600, 1200, or 2300 PEs for CMEPS, with pio_typename set to pnetcdf for all runs. In one set CMEPS set all the PIO-associated parameters itself; in a second set I manually set numiotasks to yield a stride of 4; and in a final set I set both numiotasks and stride according to whether the PE count was > or < 1000 (see med_io_mod).
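
For the stride-4 set, the IO-task count follows directly from the PE count (my inference from the description above: numiotasks = PEs / stride):

  300 PEs  ->  75 IO tasks
  600 PEs  -> 150 IO tasks
  1200 PEs -> 300 IO tasks
  2300 PEs -> 575 IO tasks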

For the existing configuration, serial netcdf is used by default; this gives a mean write time of ~2.4 s for each CMEPS restart. Using PIO+pnetcdf, the best results were found using the subset rearranger at stride=4. Depending on the number of tasks, this brings each CMEPS restart write down to between ~0.5 and 0.8 s. See full results here

@junwang-noaa commented:

Denise, thanks for testing the new parallel writing in CMEPS; the speedup is great (>60%). It would be good to test the feature in higher-resolution runs (C768 and C1152). I recall we had problems using a large number of tasks for CMEPS.

@DeniseWorthen commented:

@junwang-noaa I could test the higher-resolution ATM cases; all I need is the ATM input and the layouts to try.

@junwang-noaa commented:

@DusanJovic-NOAA do you have C768/C1152 ATM-only test cases (run directories) generated from G-W?

@DusanJovic-NOAA commented:

> @DusanJovic-NOAA do you have C768/C1152 ATM-only test cases (run directories) generated from G-W?

I have them on wcoss2 here:

/lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/c1152_gw_case/
/lfs/h2/emc/eib/noscrub/dusan.jovic/ufs/c768_gw_case/

@DeniseWorthen commented:

I've grabbed these now and will set up some more testing for CMEPS PIO options. It looks like these were used to test blocksize changes; I'm assuming I should stick with the blocksize=32 settings, right?

@DusanJovic-NOAA commented:

> I've grabbed these now and will set up some more testing for CMEPS PIO options. It looks like these were used to test blocksize changes; I'm assuming I should stick with the blocksize=32 settings, right?

Yes.

@DeniseWorthen commented:

Nothing is moving on Gaea today, but I've been testing adding the config variables to the RT templates. On Hercules, it appears that for small PE counts, as in the cpld_control test (CMEPS = 144 PEs), serial netcdf is actually faster than pnetcdf. So I plan on doing more tests on Gaea at the C384 resolution, using fewer and fewer CMEPS PEs, to see if I can identify the point at which pnetcdf starts to pay off. For those small-PE cases the serial path needs no extra tuning, as sketched below.
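
For reference, the serial path is just the serial typename in the same attribute block (a sketch; netcdf is the PIO typename for the serial backend, which this thread notes is the current default):

MED_attributes::
  ...
  pio_typename = netcdf   # serial netcdf, the current default
  ...
::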

@DeniseWorthen commented Jul 10, 2024:

I've been able to get the C768 ATM-only case running on Gaea, but it fails at about hour 21. See /gpfs/f5/nggps_emc/scratch/Denise.Worthen/cmepspio768/test.atmonly

I'm not sure why it's failing. I compiled on Gaea and used the job card from the low-res RT case, modified for the task count. All the fix files point to the G-W fix file locations on Gaea. I'm seeing:

1303: forrtl: error (78): process killed (SIGTERM)
1303: Image              PC                Routine            Line        Source
1303: libpthread-2.31.s  00007F842F290910  Unknown               Unknown  Unknown
1303: libpthread-2.31.s  00007F842F28B70C  pthread_cond_wait     Unknown  Unknown
1303: fv3.exe            0000000000C8A9B4  Unknown               Unknown  Unknown
1303: fv3.exe            0000000000C8BC29  Unknown               Unknown  Unknown
1303: fv3.exe            0000000000F70450  Unknown               Unknown  Unknown
1303: fv3.exe            00000000009FEEFE  Unknown               Unknown  Unknown
1303: fv3.exe            000000000071E971  Unknown               Unknown  Unknown
1303: fv3.exe            0000000001AC3A12  fv3atm_cap_mod_mp        1077  fv3_cap.F90
1303: fv3.exe            0000000001AC346B  fv3atm_cap_mod_mp        1026  fv3_cap.F90
1303: fv3.exe            0000000000CF36A8  Unknown               Unknown  Unknown

EDIT: Now I see that it was a time-out.

@junwang-noaa commented:

@DeniseWorthen Can you confirm that the C768 ATM test still fails on Gaea? Can you list the changes needed to turn on PIO+PnetCDF in CMEPS so that it can be tested on wcoss2?

@DeniseWorthen commented:

@junwang-noaa I haven't tried the C768 case recently. What I really need is a canned case for the coupled model that runs on Gaea; I was trying to modify the standalone case.

To turn on PnetCDF for CMEPS, add the following to the MED_attributes block in ufs.configure:

MED_attributes::
  ...
  pio_rearranger = subset
  pio_typename = pnetcdf
  pio_stride = 4
  ...
::

This will create as many IO tasks as possible, laid out at a stride of 4 across the available processors. If you need to pin the IO-task count instead, see the sketch below.
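
If you want to set the IO-task count explicitly rather than letting CMEPS derive it from the stride, there is also a task-count attribute (a sketch; pio_numiotasks is my assumed spelling, not confirmed in this thread, and 36 = 144 CMEPS PEs / stride 4 using the cpld_control numbers above):

MED_attributes::
  ...
  pio_typename = pnetcdf
  pio_rearranger = subset
  pio_stride = 4
  pio_numiotasks = 36   # assumed attribute name: explicit IO-task count
  ...
::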
