add timestamp to rpointer files #2757

Open
wants to merge 17 commits into
base: master

jedwards4b
Contributor

@jedwards4b jedwards4b commented Sep 12, 2024

Description of changes

Adds a timestamp to rpointer files in a backward-compatible manner.
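
A minimal sketch of what the dated naming and the fallback could look like, assuming the suffix follows the `YYYY-MM-DD-SSSSS` pattern seen in file names such as `rpointer.lnd.1993-12-05-00000` later in this thread, and assuming that a reader falls back to the undated name when no dated file exists. The function names `rpointer_name` and `find_rpointer` are hypothetical; the actual logic is Fortran and is not reproduced here.

```python
from pathlib import Path


def rpointer_name(component: str, ymd: int, tod: int) -> str:
    """Build a dated rpointer file name from a YYYYMMDD date and seconds of day."""
    year, rest = divmod(ymd, 10000)
    month, day = divmod(rest, 100)
    return f"rpointer.{component}.{year:04d}-{month:02d}-{day:02d}-{tod:05d}"


def find_rpointer(rundir: Path, component: str, ymd: int, tod: int) -> Path:
    """Prefer the dated file; fall back to the legacy undated name (assumed behavior)."""
    dated = rundir / rpointer_name(component, ymd, tod)
    if dated.is_file():
        return dated
    legacy = rundir / f"rpointer.{component}"
    if legacy.is_file():
        return legacy
    raise FileNotFoundError(
        f"no rpointer file found in {dated.name} or in {legacy.name}"
    )
```

With this layout an old run directory that only holds `rpointer.lnd` would still restart, while a new one can keep several dated pointers side by side.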

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):

Are answers expected to change (and if so in what way)?
no
Any User Interface Changes (namelist or namelist defaults changes)?

Does this create a need to change or add documentation? Did you do so?

Submodules updated: Needs at least the first one updated...
cime6.1.47
share1.1.5
cmeps1.0.26

Testing performed, if any: will do regular

Things to do:

  • Get externals and testing working
  • Add a user writeup about this to ChangeLog
  • Fix LILAC
  • Code review changes

@wwieder
Contributor

wwieder commented Sep 12, 2024

Thanks Jim. Can this go onto b4b-dev, @ekluzek?

@ekluzek ekluzek self-assigned this Sep 12, 2024
@ekluzek ekluzek added the enhancement and usability labels Sep 12, 2024
@ekluzek
Collaborator

ekluzek commented Sep 12, 2024

Thanks @jedwards4b.

@wwieder yes, this totally makes sense as something coming into b4b-dev. Since it is backward compatible, it doesn't need to be coordinated with other CESM tags or externals. So bringing it into b4b-dev and having it go into CTSM main-dev the next time a b4b-dev tag is made (in two weeks) makes a lot of sense.

@jedwards4b jedwards4b marked this pull request as draft September 19, 2024 20:05
@jedwards4b
Contributor Author

I've run into an issue here. On restart, the clm_timemgr reads its clock information from the restart file, which makes it hard to use the clock to locate the restart file in the first place. There is also no requirement to get the clock from the restart file, as the driver has already set it.

@wwieder wwieder added this to the cesm3_0_beta04 milestone Sep 26, 2024
@jedwards4b jedwards4b force-pushed the add_timestamp_to_rpointers branch from 463bf55 to 25efa68 Compare September 26, 2024 20:57
@jedwards4b jedwards4b marked this pull request as ready for review September 26, 2024 20:59
@jedwards4b jedwards4b force-pushed the add_timestamp_to_rpointers branch from 25efa68 to 9f07cf9 Compare September 26, 2024 21:15
@jedwards4b
Contributor Author

I have tested with ERS.ne30pg3_t232.BLT1850.derecho_intel.allactive-defaultio
and plan to do a complete set of cesm prealpha tests.

@samsrabin
Collaborator

We discussed this at the CTSM SE meeting this morning and decided it would be in our cesm3_0_beta04 tag, which fits with @jedwards4b's timeline.

Update surface datasets, CN Matrix, CLM60: excess ice on, explicit A/C on, crop calendars, Sturm snow, Leung dust emissions, prigent roughness data

Purpose and description of changes since ctsm5.2.005
----------------------------------------------------

Bring in updates needed for the CESM3.0 science capability/functionality "chill". Most importantly bringing
in: CN Matrix to speed up spinup for the BGC model, updated surface datasets, updated Leung 2023 dust emissions,
explicit Air Conditioning for the Urban model, updates to crop calendars. For clm6_0 physics these options are now
default turned on in addition to Sturm snow, and excess ice.

Changes to CTSM Infrastructure:
===============================

 - manage_externals removed and replaced by git-fleximod
 - Ability to handle CAM7 in LND_TUNING_MODE

Changes to CTSM Answers:
========================

 Changes to defaults for clm6_0 physics:
  - Urban explicit A/C turned on
  - Snow thermal conductivity is now Sturm_1997
  - New IC file for f09 1850
  - New crop calendars
  - Dust emissions is now Leung_2023
  - Excess ice is turned on
  - Updates to MEGAN for BVOC's
  - Updates to BGC fire method

 Changes for all physics versions:

  - Parameter files updated
  - FATES parameter file updated
  - Glacier region 1 is now undefined
  - Update in FATES transient Land use
  - Pass active glacier (CISM) runoff directly to river model (MOSART)
  - Add the option for using matrix for Carbon/Nitrogen BGC spinup

New surface datasets:
=====================

- With new surface datasets the following GLC fields have region "1" set to UNSET:
     glacier_region_behavior, glacier_region_melt_behavior, glacier_region_ice_runoff_behavior
- Updates to allow creating transient landuse timeseries files going back to 1700.
- Fix an important bug on soil fields that was there since ctsm5.2.0. This results in mksurfdata_esmf now giving identical answers with a change in number of processors, as it should.
- Add in creation of ne0np4.POLARCAP.ne30x4 surface datasets.
- Add version to the surface datasets.
- Remove the --hires_pft option from mksurfdata_esmf as we don't have the datasets for it.
- Remove VIC fields from surface datasets.

New input datasets to mksurfdata_esmf:
======================================

- Updates in PFT/LAI/soil-color raw datasets (now from the TRENDY2024 timeseries that ends in 2023), as well as two fire datasets (AG fire, peatland), and the glacier behavior dataset.
Same as ctsm5.3.001

I made an accidental merge and reverted it.
@ekluzek
Collaborator

ekluzek commented Dec 3, 2024

We are going to do this as a standalone tag to master, so I'll rebase to master.

@ekluzek ekluzek changed the base branch from b4b-dev to master December 4, 2024 16:53
Collaborator

@ekluzek ekluzek left a comment

@jedwards4b this is great, thanks for getting this out there for us. I saw you added some nice improvements as well (only reading the rpointer file on masterproc and catching some typos).

There are some changes that are required, and some I think would be good to do as they should be easy. They are outlined in the code changes. Right now I'm planning on just doing those changes. Feel free to comment on any of it though.

The required change is to move the updates into lnd_comp_esmf.F90 for LILAC.

Thanks again for the PR.

src/main/restFileMod.F90 (resolved)
src/utils/clm_time_manager.F90 (resolved, two threads)
src/cpl/nuopc/lnd_comp_nuopc.F90 (resolved)
@@ -1038,53 +1041,54 @@ subroutine ModelSetRunClock(gcomp, rc)
call ESMF_LogWrite(subname//'setting alarms for ' // trim(name), ESMF_LOGMSG_INFO)

!----------------
! Restart alarm
Collaborator

@jedwards4b here the order of setting the stop alarm and then restart alarm was switched. As far as I could see, there isn't a strict need to do this. I figured a preferred order might be stop first and then restart, maybe to be consistent elsewhere.

But, I wanted to make sure I wasn't missing anything. So is this a preferential change or one that's absolutely needed? Thanks in advance.

Contributor Author

Actually, there is a requirement to do this. When you request a restart write at the end of the run, you need to know when the end of the run is in order to set the restart alarm. By initializing the stop alarm first, I have the information I need to set the restart alarm.

Collaborator

Excellent, thanks for the explanation; that helps.

I'll add a comment about this then. And make sure the same is done in LILAC.
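
The ordering constraint described above can be sketched as follows. This is an illustration in Python, not the Fortran in lnd_comp_nuopc.F90; the function names and the `end` and `ndays` option strings are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional


def stop_alarm_time(start: datetime, stop_n_days: int) -> datetime:
    """Time at which the run ends."""
    return start + timedelta(days=stop_n_days)


def restart_alarm_time(
    start: datetime,
    restart_option: str,
    restart_n_days: int,
    stop_time: Optional[datetime],
) -> datetime:
    """Time of the first restart write."""
    if restart_option == "end":
        # An end-of-run restart has no interval of its own: it needs the
        # stop time, so the stop alarm must already have been set up.
        if stop_time is None:
            raise ValueError("set the stop alarm before the restart alarm")
        return stop_time
    if restart_option == "ndays":
        return start + timedelta(days=restart_n_days)
    raise ValueError(f"unsupported restart option: {restart_option}")


start = datetime(1993, 12, 2)
stop = stop_alarm_time(start, 5)                      # stop alarm first
restart = restart_alarm_time(start, "end", 0, stop)   # then restart alarm
```

Swapping the two calls would leave `stop_time` unknown when the end-of-run restart alarm is requested, which is the situation the reordering avoids.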

! Initialize start date from restart info

start_date = TimeSetymd( rst_start_ymd, rst_start_tod, "start_date" )
! Check start date from restart info
Collaborator

The timemgr_spmdbcast and init_calendar calls above can also be removed, because this now requires timemgr_init to be called first. As such we should check that

timemgr_set == .true.

and abort if not.

! Initialize clock

call init_clock( start_date, ref_date, curr_date)

Collaborator

At the end also remove

if (masterproc) call timemgr_print()

As it's already done in the timemgr_init step previously. No reason to repeat.


!---------------------------------------------------------------------------------
! Restart the ESMF time manager using the synclock for ending date.
!

Collaborator

The purpose of this subroutine is now just to do some checking, set a couple of variables, and do the advance.

@ekluzek
Collaborator

ekluzek commented Dec 4, 2024

Running aux_clm on Derecho, I'm seeing most tests passing (199), with only 12 pending but 23 failing. LILAC fails as I expected, but a bunch of ERI tests, the SSP tests, one ERP, a few ERS, one REP, and a few SMS tests fail at the RUN phase.

@jedwards4b
Contributor Author

@ekluzek I haven't yet merged the cime PR that you will need for these tests. Are you using the branch?

@jedwards4b
Contributor Author

I just merged it - try updating to cime6.1.47

@ekluzek
Collaborator

ekluzek commented Dec 4, 2024

Ahh, OK, thanks @jedwards4b! I'll update to that and see how it goes.

@ekluzek ekluzek added the next label Dec 5, 2024
@samsrabin samsrabin removed the next label Dec 5, 2024
@ekluzek
Collaborator

ekluzek commented Dec 9, 2024

@jedwards4b yep, I'm starting up testing now.

By the way, shouldn't the share tag name be share1.1.6 rather than 1.0.21?

@jedwards4b
Contributor Author

Good point - I'll fix the share tag name, please use share1.1.6

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

More tests are working on Izumi now, but several fail with the two issues below...

1.) ERP tests:

A bunch of ERP tests fail at the build step with this:

Command: ./case.build --sharedlib-only
Output: WARNING: Found difference in test REST_OPTION: case: ndays original value $STOP_OPTION
 Successfully created new case ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso.GC.ctsm5214rpointeracl_nag from clone case ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso.GC.ctsm5214rpointeracl_nag 
Setting resource.RLIMIT_STACK to -1 from (-1, -1)
Setting resource.RLIMIT_STACK to -1 from (-1, -1)
Building test for ERP in directory /scratch/cluster/erik/tests_ctsm5214rpointeracl/ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso.GC.ctsm5214rpointeracl_nag
WARNING: Test case setup failed. Case2 has been removed, but the main case may be in an inconsistent state. If you want to rerun this test, you should create a new test rather than trying to rerun this one.
Traceback (most recent call last):
  File "./case.build", line 267, in <module>
    _main_func(__doc__)
  File "./case.build", line 226, in _main_func
    test = find_system_test(testname, case)(case)
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/erp.py", line 29, in __init__
    **kwargs
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/restart_tests.py", line 30, in __init__
    **kwargs
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_compare_two.py", line 146, in __init__
    self._setup_cases_if_not_yet_done()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_compare_two.py", line 450, in _setup_cases_if_not_yet_done
    self._setup_cases()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_compare_two.py", line 540, in _setup_cases
    self._case_one_setup()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/restart_tests.py", line 36, in _case_one_setup
    self._set_restart_interval()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_common.py", line 198, in _set_restart_interval
    startdatetime = datetime.fromisoformat(startdate) + timedelta(
AttributeError: type object 'datetime.datetime' has no attribute 'fromisoformat'

 ---------------------------------------------------

However, I can go into the test directory and run:

./case.build
./case.submit

and it seems to work fine, even though the error message above warns against doing that.
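
The AttributeError above points at the interpreter rather than the test itself: `datetime.fromisoformat` was only added in Python 3.7, so the Python that ran `case.build` on Izumi is presumably older. A minimal sketch of a parse that also works on older interpreters, assuming the start date is a plain `YYYY-MM-DD` string (the helper name `parse_start_date` is hypothetical and is not taken from CIME):

```python
from datetime import datetime, timedelta


def parse_start_date(startdate: str) -> datetime:
    """Parse a 'YYYY-MM-DD' date without relying on datetime.fromisoformat."""
    # strptime predates fromisoformat, so it is also available before Python 3.7.
    return datetime.strptime(startdate, "%Y-%m-%d")


# Same arithmetic shape as in the traceback: start date plus an offset in days.
restart_date = parse_start_date("1993-12-02") + timedelta(days=5)
```

Running the tests with a newer Python would avoid the problem without any code change; the sketch only shows that the parse itself does not need the newer API.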

2.) NEON tests:

Also the NEON tests are now ALL failing because it can no longer find the NEON user-mod. It gives the following error:

For example for the test: SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM

2024-12-09 18:34:41: Could not locate testmod 'NEON/MOAB'

Those are tests that had been working for a very long time.

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

On Derecho I have a ton of ERI, ERP, SSP, and NEON failures:

ERI_C2_Ld9.f10_f10_mg37.I2000Clm60BgcCrop.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningModeLDust	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-default--clm-matrixcnOn	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.ne30_g17.I2000Clm50BgcCru.derecho_intel.clm-vrtlay	(NLCOMP RUN)		
ERI_D_Ld9.ne30_g17.I2000Clm50BgcCru.derecho_intel.clm-vrtlay--clm-matrixcnOn	(NLCOMP RUN)		
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-drydepnomegan	(NLCOMP RUN)		
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire	(NLCOMP RUN)		
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire--clm-matrixcnOn	(NLCOMP RUN)		
ERP_D.f10_f10_mg37.IHistClm60Bgc.derecho_gnu.clm-decStart	(NLCOMP RUN)		
ERP_D.f10_f10_mg37.IHistClm60Bgc.derecho_gnu.clm-decStart--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_D_Ld10.f10_f10_mg37.I1850Clm60BgcCrop.derecho_intel.clm-ADspinup	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-ciso_decStart	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-ciso_decStart--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_Ld3_P64x2.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_D_Ld3_P64x2.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld3_PS.f09_g17.I2000Clm50Sp.derecho_intel.clm-prescribed	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-drydepnomegan	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I1850Clm50BgcCropG.derecho_gnu.clm-glcMEC_changeFlags	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-ciso_flexCN_FUN	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-ciso_flexCN_FUN--clm-matrixcnOn	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-fire_emis	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-anoxia	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-anoxia--clm-matrixcnOn	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-reduceOutput	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_intel.clm-reduceOutput	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm60Sp.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm45Sp.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-allActive	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-allActive--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm50SpCru.derecho_gnu.clm-drydepnomegan--clm-nofireemis	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm60Sp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld5.ne30pg3_t232.IHistClm60Sp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld9.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam6LndTuningMode	(NLCOMP RUN)		
ERP_D_Ld9.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningModeLDust	(NLCOMP RUN)		
ERP_D_Ld9.ne30pg3_t232.IHistClm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningModeLDust	(NLCOMP RUN)		
ERP_D_P128x1_Ld26.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-crop--clm-midDecStart--clm-RxCropCalsAdaptGGCMI	(NLCOMP RUN)		
ERP_D_P64x2_Ld10.f10_f10_mg37.I2000Clm60Bgc.derecho_intel.clm-Hillslope	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm60BgcCrop.derecho_gnu.clm-mimics	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm45BgcCrop.derecho_gnu.clm-no_subgrid_fluxes	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-snowveg_norad	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-cn_conly	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-flexCN_FUN	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-flexCN_FUN--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-luna	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-noFUN_flexCN	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-noFUN_flexCN--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-coldStart	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-coldStart--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld30.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld5.f10_f10_mg37.I2000Clm50BgcCropRtm.derecho_intel.clm-irrig_spunup	(NLCOMP RUN)		
ERP_D_P64x2_Ld5.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-irrig_spunup	(NLCOMP RUN)		
ERP_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_Ld9.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdAllVars	(NLCOMP RUN)		
ERP_Ld9.f45_g37.I2000Clm60Bgc.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_Ly3_P64x2.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-cropMonthOutput	(NLCOMP RUN)		
ERP_Ly3_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_P128x2_Ld30.f45_f45_mg37.I2000Clm60FatesSpCruRsGs.derecho_intel.clm-FatesColdSatPhen	(NLCOMP RUN)		
ERP_P256x2_D_Ld5.f19_g17_gris4.I1850Clm50BgcCropG.derecho_intel.clm-glcMEC_increase	(NLCOMP RUN)		
ERP_P256x2_Ld30.f45_f45_mg37.I2000Clm60FatesRs.derecho_intel.clm-mimicsFatesCold	(NLCOMP RUN)		EXPECTED (RUN)
ERP_P64x2_D.f10_f10_mg37.I2000Clm50SpRtmFl.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld10.f10_f10_mg37.IHistClm50SpG.derecho_intel.clm-glcMEC_decrease--clm-nofireemis	(NLCOMP RUN)		
ERP_P64x2_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-extra_outputs	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-crop	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCru.derecho_intel.clm-ciso	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Clm45Sp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Ctsm50NwpBgcCropGswp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Ctsm50NwpSpGswp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.IHistClm45BgcCru.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_P64x2_Ld1096.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-clm50cropIrrigMonth_interp	(NLCOMP RUN)		
ERP_P64x2_Ld1096.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-irrig_o3falk_reduceOutput	(NLCOMP RUN)		
ERP_P64x2_Ld366.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-irrig_alternate_monthly	(NLCOMP RUN)		
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_gnu.clm-monthly	(NLCOMP RUN)		
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly	(NLCOMP RUN)		
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_P64x2_Ld762.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-monthly	(NLCOMP RUN)		
ERS_D_Ld20.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdTwoStream	(NLCOMP COMPARE_base_rest)		EXPECTED (COMPARE_base_rest)
ERS_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-default	(NLCOMP RUN)		
ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate	(NLCOMP RUN)		
ERS_D_Ld6.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-clm50CMIP6frc	(NLCOMP RUN)		
ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm50FatesRs.derecho_gnu.clm-FatesCold	(NLCOMP RUN)		
ERS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm50FatesRs.derecho_gnu.clm-FatesCold	(NLCOMP RUN)		
ERS_L761.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_gnu.clm-smallville_dynurban_monthly	(XML)		
ERS_Ld3_D.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-rad_hrly_light_res_half	(NLCOMP RUN)		
ERS_Ld765.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_gnu.clm-smallville_dynlakes_monthly	(NLCOMP RUN)		
ERS_P128x1_Ld762.f10_f10_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdNoComp	(NLCOMP RUN)		
LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac	(NLCOMP MODEL_BUILD)		
REP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings	(NLCOMP COMPARE_base_rep2 BASELINE)		
SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.derecho_nvhpc.clm-crop	(SHAREDLIB_BUILD NLCOMP)		EXPECTED (RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-NEON-MOAB--clm-PRISM	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-default--clm-NEON-HARV	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-default--clm-NEON-HARV--clm-matrixcnOn	(CREATE_NEWCASE)		
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_gnu.clm-FatesPRISM--clm-NEON-FATES-YELL	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_intel.clm-FatesFireLightningPopDens--clm-NEON-FATES-NIWO	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60SpRs.derecho_intel.clm-default--clm-NEON-TOOL	(CREATE_NEWCASE)		
SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP	(NLCOMP RUN)		
SSP_D_Ld4.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-ciso_rtmColdSSP	(NLCOMP RUN)		
SSP_Ld10.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-rtmColdSSP	(NLCOMP RUN)

Specific problems I looked at...

1.) ERS test for a specific year:

ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate

atm.log:

(shr_strdata_readstrm) reading file lb: /glade/campaign/cesm/cesmdata/inputdata/atm/datm7/topo_forcing/topodata_0.9x1.25_USGS_070110_stream_c151201.nc       1
(shr_strdata_readstrm) reading file ub: /glade/campaign/cesm/cesmdata/inputdata/atm/datm7/topo_forcing/topodata_0.9x1.25_USGS_070110_stream_c151201.nc       1
 (datm_datamode_clmncep_advance): tbotmax =    290.55999755859375     
 (datm_datamode_clmncep_advance): anidrmax =    1.0000000000000000E+030
 atm : model date     19931205           0
 ERROR: shr_get_rpointer_nameERROR no rpointer file found in rpointer.cpl                                                                                                                                                                                                                                                     or in rpointer.cpl

The dated rpointer.cpl files ARE on disk though...

(ctsm_pylib) tests_ctsm5314rpointeracl/ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate.GC.ctsm5314rpointeracl_gnu> cat run/rpointer.
rpointer.atm                   rpointer.cpl.1993-12-05-00000  rpointer.cpl.1993-12-07-00000  rpointer.lnd.1993-12-02-00000  rpointer.lnd.1993-12-05-00000  rpointer.lnd.1993-12-07-00000

cesm.log:

 (t_initf)       profile_outpe_num=                  1
 (t_initf)       profile_outpe_stride=               0
 (t_initf)       profile_single_file=      F
 (t_initf)       profile_global_stats=     T
 (t_initf)       profile_ovhd_measurement= F
 (t_initf)       profile_add_detail=       F
 (t_initf)       profile_papi_enable=      F
 ERROR: shr_get_rpointer_nameERROR no rpointer file found in rpointer.cpl                                                                                                                                                                                                                                                     or in rpointer.cpl
#0  0x125b1cf in __shr_abort_mod_MOD_shr_abort_backtrace
	at /glade/work/erik/ctsm_worktrees/quickfix/share/src/shr_abort_mod.F90:104
#1  0x125b292 in __shr_abort_mod_MOD_shr_abort_abort
	at /glade/work/erik/ctsm_worktrees/quickfix/share/src/shr_abort_mod.F90:61
#2  0x1255a3a in __nuopc_shr_methods_MOD_shr_get_rpointer_name
	at /glade/work/erik/ctsm_worktrees/quickfix/share/src/nuopc_shr_methods.F90:849
#3  0x55f78f in __med_phases_restart_mod_MOD_med_phases_restart_read
	at /glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../mediator/med_phases_restart_mod.F90:545
#4  0x46ab0b in datainitialize
	at /glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../mediator/med.F90:2191

med.log, drv.log, and lnd.log all look fine and don't report errors.

However, the drv.log does seem to have the right year, as follows. So possibly there's a missing broadcast of year?

drv.log:

  read rpointer file = rpointer.cpl.1993-12-05-00000
(/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit) reading driver restart from file = ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate.GC.ctsm5314rpointeracl_gnu.cpl.r.1993-12-05-00000.nc
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver start_ymd:   19931202
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver start_tod:          0
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver curr_ymd:   19931205
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver curr_tod:          0
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver time interval is :       3600

2.) ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default

dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf) Read in prof_inparm namelist from: drv_in
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf) Using profile_disable=          F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_timer=                      4
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_depth_limit=               12
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_detail_limit=               2
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_barrier=          F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_outpe_num=                  1
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_outpe_stride=               0
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_single_file=      F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_global_stats=     T
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_ovhd_measurement= F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_add_detail=       F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_papi_enable=      F
dec2324.hsn.de.hpc.ucar.edu 0:  ESMF_Finalize: Error closing trace stream
dec2324.hsn.de.hpc.ucar.edu 0: MPICH ERROR [Rank 0] [job id 889e4ccc-0d6a-437e-87c3-408c149f1bf9] [Mon Dec  9 19:36:23 2024] [dec2324] - Abort(1) (rank 0 in comm 496): application called MPI_Abort(comm=0x84000002, 1) - process 0
dec2324.hsn.de.hpc.ucar.edu 0: 
dec2324.hsn.de.hpc.ucar.edu 0: forrtl: severe (174): SIGSEGV, segmentation fault occurred
dec2324.hsn.de.hpc.ucar.edu 0: Image              PC                Routine            Line        Source             
dec2324.hsn.de.hpc.ucar.edu 0: libpthread-2.31.s  0000150B6470E8C0  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libmpi_intel.so.1  0000150B626CDE7E  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libmpi_intel.so.1  0000150B624DC22F  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libmpi_intel.so.1  0000150B60B096A8  MPI_Abort             Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6D823E82  abort                     863  ESMCI_VMKernel.C
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6D81DD03  abort                    3634  ESMCI_VM.C
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6D848431  c_esmc_vmabort_          1252  ESMCI_VM_F.C
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6EDC3D87  esmf_vmmod_mp_esm        9521  ESMF_VM.F90
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6E8C6A58  esmf_initmod_mp_e        1684  ESMF_Init.F90
dec2324.hsn.de.hpc.ucar.edu 0: cesm.exe           0000000000449347  MAIN__                    132  esmApp.F90
dec2324.hsn.de.hpc.ucar.edu 0: cesm.exe           0000000000421A3D  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libc-2.31.so       0000150B6000029D  __libc_start_main     Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: cesm.exe           000000000042196A  Unknown               Unknown  Unknown

3.) ERP_D_Ld5.f10_f10_mg37.IHistClm60Sp.derecho_intel.clm-default

Failure looks similar to above

4.) SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.derecho_nvhpc.clm-crop

Fails in build -- looks like it's a CTSM issue that I'll work on.

5.) NEON tests are probably the same as on Izumi

6.) SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP

This is a CTSM specific test type for doing a spinup.

It fails on submit with the following:

Submitting job script qsub -q main -l walltime=00:20:00 -A P93300606 -l job_priority=regular -v ARGS_FOR_SCRIPT='--skip-preview-namelist' /glade/derecho/scratch/erik/tests_ctsm5314rpointeracl/SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP.GC.ctsm5314rpointeracl_int/.case.test
Submitted job id is 7129803.desched1
Submitted job case.test with id 7129803.desched1
submit_jobs case.test
Submit job case.test

 ---------------------------------------------------
2024-12-09 20:37:25: ERROR: Cannot modify case, read_only. Case must be opened with read_only=False and can only be modified within a context manager
 ---------------------------------------------------

@jedwards4b
Contributor Author

@ekluzek @billsacks This is due to an inconsistency in the way we name test mods and user mods. For testmods_dirs we require the component name in the path, for example CTSM/cime_config/testdefs/testmods_dirs/clm/PRISM, but for user mods directories we do not: CTSM/cime_config/usermods_dirs/NEON/MOAB. Jason Boutte refactored this function in commit bf19cab32f984a5f2256b53af05f0ab63232bd95, which was merged in tag cime6.0.246.

I think the easiest solution may be to add the component name in the user mods dir (or remove it from the test mods dir) so that the naming convention is consistent. I have tested this by moving the directories in /home/jedwards/CTSM/cime_config/usermods_dirs/ to /home/jedwards/CTSM/cime_config/usermods_dirs/clm, and have confirmed that this solves the problem. What do you think of this solution?
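To make the path mismatch concrete, here is a minimal sketch of the lookup convention described above. It is not CIME's actual code: `resolve_mods_dir` is a hypothetical helper, and it assumes that the text before the first dash in a test modifier (for example `clm-NEON-MOAB`) is the component name and that the remaining dashes separate subdirectories.

```python
import tempfile
from pathlib import Path


def resolve_mods_dir(mod_name, roots):
    """Return the first existing <root>/<component>/<subdirs> directory, or None.

    Hypothetical helper (an assumption, not a CIME function): the text before
    the first dash is treated as the component name, and the remaining dashes
    separate subdirectories.
    """
    component, _, rest = mod_name.partition("-")
    relative = Path(component, *rest.split("-"))
    for root in roots:
        candidate = Path(root) / relative
        if candidate.is_dir():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    testmods = Path(tmp, "testdefs", "testmods_dirs")
    usermods = Path(tmp, "usermods_dirs")
    (testmods / "clm" / "PRISM").mkdir(parents=True)
    # Current usermods layout: no component level in the path.
    (usermods / "NEON" / "MOAB").mkdir(parents=True)

    roots = [testmods, usermods]
    found_prism = resolve_mods_dir("clm-PRISM", roots)
    found_before = resolve_mods_dir("clm-NEON-MOAB", roots)

    # Proposed layout: move the usermods directories under a "clm" subdirectory.
    (usermods / "clm").mkdir()
    (usermods / "NEON").rename(usermods / "clm" / "NEON")
    found_after = resolve_mods_dir("clm-NEON-MOAB", roots)

    print(found_prism is not None, found_before is not None, found_after is not None)
```

Under these assumptions, the test mod resolves in both layouts, while the NEON/MOAB user mod resolves only once it sits under a clm subdirectory.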

@jedwards4b
Contributor Author

@ekluzek your sandbox in /glade/work/erik/ctsm_worktrees/quickfix has not been updated to the latest tags.

@billsacks
Member

> This is due to an inconsistency in the way we name test mods and user mods. For testmods_dirs we require the component name in the path, for example CTSM/cime_config/testdefs/testmods_dirs/clm/PRISM, but for user mods directories we do not: CTSM/cime_config/usermods_dirs/NEON/MOAB. Jason Boutte refactored this function in commit bf19cab32f984a5f2256b53af05f0ab63232bd95, which was merged in tag cime6.0.246.
>
> I think the easiest solution may be to add the component name in the user mods dir (or remove it from the test mods dir) so that the naming convention is consistent. I have tested this by moving the directories in /home/jedwards/CTSM/cime_config/usermods_dirs/ to /home/jedwards/CTSM/cime_config/usermods_dirs/clm, and have confirmed that this solves the problem. What do you think of this solution?

Thanks, @jedwards4b ! I think what you're saying is that, for a directory in the usermods_dirs space to be picked up via a test name, it would need to fall under a clm subdirectory. Just wanting to make sure that the issue is limited to picking up usermods via test names and isn't more general. If so, I support your simple suggestion for a fix if it's okay with @ekluzek .

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

@jedwards4b ahh, you are right, I didn't pull the latest update that I had pushed. Thanks for the correction. Resending those tests.

@jedwards4b
Contributor Author

Once I changed the user mods path I got a ton of compiler errors from nag using test SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM. Changing the compiler to gnu allowed everything to pass.

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

> @ekluzek @billsacks This is due to an inconsistency in the way we name test mods and user mods. For testmods_dirs we require the component name in the path, for example CTSM/cime_config/testdefs/testmods_dirs/clm/PRISM, but for user mods directories we do not: CTSM/cime_config/usermods_dirs/NEON/MOAB. Jason Boutte refactored this function in commit bf19cab32f984a5f2256b53af05f0ab63232bd95, which was merged in tag cime6.0.246.
>
> I think the easiest solution may be to add the component name in the user mods dir (or remove it from the test mods dir) so that the naming convention is consistent. I have tested this by moving the directories in /home/jedwards/CTSM/cime_config/usermods_dirs/ to /home/jedwards/CTSM/cime_config/usermods_dirs/clm, and have confirmed that this solves the problem. What do you think of this solution?

@jedwards4b thanks for pointing this out. Three comments here...

First, I largely agree with @billsacks here that this is the best solution moving forward. Having "clm" in the name for BOTH testmods and usermods means you can know which component has that mod directory. It also keeps them consistent, which is good as well.

Second, the catch is that this will change behavior for the NEON and PLUMBER2 folks, and I want to run it by them first. I think we can convince them it's OK though.

Third, looking at that cime commit, it came in with cime6.1.39, which was just after our latest CTSM tag, ctsm5.3.014, which used cime6.1.37. Otherwise we should have seen it sooner, as our previous tags were at cime6.0.246 as well.

bf19cab32f984a5f2256b53af05f0ab63232bd95

So I'll check with our peeps and make sure this is OK, but we'll plan on that solution...

@ekluzek
Collaborator

ekluzek commented Dec 12, 2024

I have the NEON system tests working now. I'm struggling with a problem in run_neon though...

  File "/glade/work/erik/ctsm_worktrees/quickfix/cime/CIME/nmlgen.py", line 344, in get_default
    name, attributes=config, exact_match=False
  File "/glade/work/erik/ctsm_worktrees/quickfix/cime/CIME/XML/namelist_definition.py", line 222, in get_value_match
    replacement_for_none="",
  File "/glade/work/erik/ctsm_worktrees/quickfix/cime/CIME/XML/entry_id.py", line 75, in get_value_match
    replacement_for_none=replacement_for_none,
  File "/glade/work/erik/ctsm_worktrees/quickfix/cime/CIME/XML/entry_id.py", line 131, in _get_value_match
    self.get(vnode, attribute), attributes[attribute]
  File "/glade/work/erik/conda-envs/ctsm_pylib/lib/python3.7/re.py", line 185, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object

And there are still a few system tests that fail unexpectedly.

ERS_L761.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_gnu.clm-smallville_dynurban_monthly (XML)
ERS_P128x1_Ld762.f10_f10_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdNoComp (NLCOMP RUN)
SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP (NLCOMP RUN)
SSP_D_Ld4.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-ciso_rtmColdSSP (NLCOMP RUN)
SSP_Ld10.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-rtmColdSSP (NLCOMP RUN)

On Izumi I have the problem that ERP tests seem to need to be built by hand before submitting. I'll make sure that's the case for the rest of them.

@jedwards4b
Contributor Author

It looks as if you are trying to match an empty string in the NEON case - I can have a look if you give me instructions to reproduce. Also how do you reproduce the Izumi problem? Looking into the SSP issue...

@jedwards4b
Contributor Author

Looking in particular at the test SSP_D_Ld4.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-ciso_rtmColdSSP, I see the COMPSET is 1850_DATM%GSWP3v1_CLM50%BGC-CROP_SICE_SOCN_MOSART_SGLC_SWAV_SESP, which includes an active RTM. But the nuopc-generated run sequence does not include this component:
Writing nuopc_runconfig for components ['CPL', 'ATM', 'LND']
I have confirmed that this is also the behavior in alpha05a and in all previous testing in /glade/derecho/scratch/erik (using command find tests*/SSP_D_Ld4.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-ciso_rtmColdSSP* -name TestStatus.log -exec grep nuopc_runconfig {} \; -print)
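The same check can be scripted rather than read off the grep output. The following is only a sketch: the log line format is taken from the message quoted above, while the helper name `runconfig_components` and the use of "ROF" as the label a river component would carry in that list are assumptions.

```python
import ast
import re

# Matches the line quoted above and captures the bracketed component list.
RUNCONFIG_RE = re.compile(r"Writing nuopc_runconfig for components (\[.*?\])")


def runconfig_components(log_text):
    """Return the component list from each nuopc_runconfig line in log_text."""
    return [ast.literal_eval(m.group(1)) for m in RUNCONFIG_RE.finditer(log_text)]


# Sample text copied from the log line above; a real check would read
# each TestStatus.log file instead.
sample_log = "Writing nuopc_runconfig for components ['CPL', 'ATM', 'LND']\n"
components = runconfig_components(sample_log)

# "ROF" is an assumed label for the river component in this list.
has_rof = ["ROF" in comps for comps in components]
print(components, has_rof)
```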

So I think that this test has a fundamental problem present since the inception of nuopc.

@jedwards4b
Contributor Author

I see now that MOSART_MODE=NULL so this seems to be by design.

@ekluzek
Collaborator

ekluzek commented Dec 12, 2024

Hey Jim. Thanks for taking a look.

On the run_neon failure: you can test this with...

cd python
./run_sys_tests --sys

And the error shows up pretty quickly. I'm going to be working on this one, and will also get some help from @slevis-lmwg

I thought it might be the cime update, but that doesn't seem to be the problem...

@ekluzek
Collaborator

ekluzek commented Dec 12, 2024

On the SSP test issue, these were passing in ctsm5.3.014, so this is a new failure. And yes, it's turning off the ROF model by design, so NUOPC is functioning correctly there.

The Izumi problem came up for me just by sending in the tests. After the ERP tests failed, I had to go into them and run ./case.build; ./case.submit by hand. After that it worked. I'm going to verify that this gets ALL the tests to work.

I assume it will fail similarly when sending just a single test (for example: ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso), but I haven't verified that yet. I will.

@ekluzek
Collaborator

ekluzek commented Dec 12, 2024

We talked about this PR in our CTSM SE meeting this morning, and we need to get this tag in ASAP for CESM, so we may have to bring it in with a few glitches. I should bring it in by tomorrow even if there are a few issues still outstanding.

@jedwards4b @wwieder @briandobbins @fischer-ncar @cacraigucar

@ekluzek
Collaborator

ekluzek commented Dec 12, 2024

OK the run_neon problem is actually important -- as it's showing a problem for cases when create_newcase is used, rather than only under create_test. Cases will fail at preview_namelist because of an update in CDEPS that expects the TESTCASE attribute to be available in the case. It's always there for testcases, but it's missing when just create_newcase is used.

The simplest fix might be to move it into env_run.xml (from env_test.xml) in cime so that it always exists and can be relied upon.
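The TypeError in the run_neon traceback is what Python's `re.search` raises when the string argument is `None`, which fits an attribute such as TESTCASE being absent from the case. Below is a minimal sketch of that failure mode and of one defensive way to handle it. The function `attribute_matches` and the sample pattern are hypothetical, not CIME, CDEPS, or CMEPS code.

```python
import re


def attribute_matches(pattern, value, replacement_for_none=""):
    """Hypothetical guard: treat a missing (None) attribute as a replacement string."""
    if value is None:
        value = replacement_for_none
    return re.search(pattern, value) is not None


# Reproduce the failure mode: searching None raises TypeError.
try:
    re.search("SSP", None)
    raised = False
except TypeError:
    raised = True

# With the guard, a missing attribute simply fails to match a non-empty pattern.
unset_matches = attribute_matches("SSP", None)
set_matches = attribute_matches("SSP", "SSP")
print(raised, unset_matches, set_matches)
```

Whether the right fix is a guard like this or making the attribute always present, as suggested above, is a design choice for the maintainers.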

@jedwards4b
Contributor Author

@ekluzek To https://github.com/jedwards4b/CMEPS.git

  • [new branch] testcase_unset -> testcase_unset

Please confirm this works and I'll make a new cmeps tag.
