Add module file to compile on AWS #742

Merged
RussTreadon-NOAA merged 4 commits into NOAA-EMC:develop from NOAA-EPIC:wei-epic-aws on May 7, 2024

Conversation

weihuang-jedi
Contributor

@weihuang-jedi weihuang-jedi commented Apr 29, 2024

Add module file to compile on AWS.

Resolves #741

This PR only changes ush/module-setup.sh and adds a module file for AWS.

  • [x] New feature (non-breaking change which adds functionality)

  • [x] My code follows the style guidelines of this project

  • [x] I have performed a self-review of my own code

@weihuang-jedi weihuang-jedi marked this pull request as ready for review April 29, 2024 16:12
@RussTreadon-NOAA
Contributor

@weihuang-jedi , one request and two questions

Request: please update NOAA-EPIC:wei-epic-aws with recent commits to GSI develop.

Questions:

  • who would you like to review this PR? GSI PRs need two peer reviewers.
  • what's the plan for ensuring AWS GSI is routinely tested to ensure future GSI PRs do not break AWS GSI?

@weihuang-jedi
Contributor Author

@RussTreadon-NOAA I have synced the fork and will keep the EPIC side in sync.

I do not know who should review this request: a) I do not know whom to ask, and b) the EMC GitHub repo won't allow me to request a reviewer (only the GDASApp repo allows me to do so).

I will be responsible for syncing and testing GSI on AWS, and would like to build a relationship with the GSI team to keep doing so.

Thanks,

Wei

@RussTreadon-NOAA
Contributor

@weihuang-jedi , thank you for your reply. I'll discuss this PR with the GSI Handling Review team. When you have time please look through the GSI wiki to learn more about GSI code management policy.

@RussTreadon-NOAA
Contributor

@weihuang-jedi , which NOAA RDHPCS machines do you have access to? Hera, Orion, Hercules, ...? Do you have WCOSS2 access?

@weihuang-jedi
Contributor Author

weihuang-jedi commented May 6, 2024 via email

@RussTreadon-NOAA RussTreadon-NOAA self-requested a review May 6, 2024 18:09
@RussTreadon-NOAA
Contributor

@weihuang-jedi - every GSI PR needs two peer reviewers. Members of the GSI Handling Review Team (Shun Liu, Cory Martin, Ming Hu, and I) can not be peer reviewers. That said, I assigned myself as a reviewer for this PR.

@weihuang-jedi
Contributor Author

weihuang-jedi commented May 6, 2024 via email

@RussTreadon-NOAA Could you please ask one more person to review it? As I said, I cannot request a reviewer myself. Thank you.

@RussTreadon-NOAA
Contributor

@weihuang-jedi , I know that you can not assign a reviewer.

Since we are transitioning to JEDI, we have limited developer access to the authoritative GSI repository. This is why you can not assign reviewers. We ask developers to identify two or more peer reviewers and relay this information to the GSI Handling Review Team. A review team member will then assign the requested peer reviewers. Let me know who you want as the second reviewer and I'll assign this person to this PR.

GSI code management differs from other EMC repositories, as described on the GSI wiki:

The Unified Forecast System (UFS) will use the Joint Effort for Data assimilation Integration (JEDI) for its DA infrastructure. JEDI components, as they mature, will incrementally replace operational GSI-based components.

Given this, the focus of the NOAA-EMC/GSI repository has shifted to operational support during this transition. NOAA-EMC/GSI only supports current and planned operational NWS realizations of the GSI. Changes to NOAA-EMC/GSI must have a clear path to implementation with agreement from operational GSI application leads.

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Minor comments. Otherwise looks good.

modulefiles/gsi_noaacloud.intel.lua (outdated, resolved)
modulefiles/gsi_noaacloud.intel.lua (outdated, resolved)
@RussTreadon-NOAA
Contributor

WCOSS2 ctests

Install NOAA-EPIC:wei-epic-aws at c978d6f on Cactus. Run ctests using develop at a3a2633 as the control.

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr742/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_glbens
    Start 4: netcdf_fv3_regional
    Start 5: hafs_4denvar_glbens
    Start 6: hafs_3denvar_hybens
    Start 7: global_enkf
1/7 Test #4: netcdf_fv3_regional ..............***Failed  484.89 sec
2/7 Test #3: rrfs_3denvar_glbens ..............   Passed  489.35 sec
3/7 Test #7: global_enkf ......................   Passed  851.85 sec
4/7 Test #2: rtma .............................   Passed  969.38 sec
5/7 Test #6: hafs_3denvar_hybens ..............   Passed  1214.55 sec
6/7 Test #5: hafs_4denvar_glbens ..............***Failed  1215.14 sec
7/7 Test #1: global_4denvar ...................   Passed  1683.21 sec

71% tests passed, 2 tests failed out of 7

Total Test time (real) = 1683.28 sec

The following tests FAILED:
          4 - netcdf_fv3_regional (Failed)
          5 - hafs_4denvar_glbens (Failed)
Errors while running CTest
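
For anyone triaging a log like the one above, the per-test result lines can be summarized with a short script. This is only a minimal sketch and not part of the GSI regression suite: the regular expression and the `parse_ctest` helper are hypothetical, written against the line format shown in this comment.

```python
import re

# Result lines copied verbatim from the ctest log in this comment.
CTEST_LOG = """\
1/7 Test #4: netcdf_fv3_regional ..............***Failed  484.89 sec
2/7 Test #3: rrfs_3denvar_glbens ..............   Passed  489.35 sec
3/7 Test #7: global_enkf ......................   Passed  851.85 sec
4/7 Test #2: rtma .............................   Passed  969.38 sec
5/7 Test #6: hafs_3denvar_hybens ..............   Passed  1214.55 sec
6/7 Test #5: hafs_4denvar_glbens ..............***Failed  1215.14 sec
7/7 Test #1: global_4denvar ...................   Passed  1683.21 sec
"""

# Hypothetical pattern for one result line:
#   "<n>/<total> Test #<id>: <name> ....[***]<status>  <seconds> sec"
LINE_RE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(\d+):\s+(\S+)\s+\.+\s*(?:\*\*\*)?(\w+)\s+([\d.]+)\s+sec"
)


def parse_ctest(log):
    """Return one dict per result line found in the log text."""
    results = []
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match:
            results.append(
                {
                    "id": int(match.group(1)),
                    "name": match.group(2),
                    "status": match.group(3),
                    "seconds": float(match.group(4)),
                }
            )
    return results


results = parse_ctest(CTEST_LOG)
failed = [r["name"] for r in results if r["status"] != "Passed"]
passed_pct = 100 * (len(results) - len(failed)) / len(results)

print(f"{passed_pct:.0f}% tests passed, {len(failed)} tests failed out of {len(results)}")
print("Failed:", ", ".join(failed))
```

The summary it prints should agree with the "71% tests passed, 2 tests failed out of 7" line that ctest itself reports.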

The netcdf_fv3_regional failure is due to

The memory for netcdf_fv3_regional_loproc_updat is 302544 KBs.  This has exceeded maximum allowable memory of 180435 KBs, resulting in Failure memthresh of the regression test.

This memory threshold test does not accurately measure application memory usage. This is not a fatal fail.

The hafs_4denvar_glbens failure is due to

The runtime for hafs_4denvar_glbens_hiproc_updat is 257.991474 seconds.  This has exceeded maximum allowable threshold time of 253.563486 seconds, resulting in Failure of timethresh2 the regression test.

A check of the gsi.x wall times does not find anomalous behavior

hafs_4denvar_glbens_hiproc_contrl/stdout:The total amount of wall time                        = 230.512260
hafs_4denvar_glbens_hiproc_updat/stdout:The total amount of wall time                        = 257.991474
hafs_4denvar_glbens_loproc_contrl/stdout:The total amount of wall time                        = 260.443699
hafs_4denvar_glbens_loproc_updat/stdout:The total amount of wall time                        = 261.777999

This is not a fatal fail.
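
The timethresh2 numbers above can be checked by hand. The reported threshold of 253.563486 seconds equals the hiproc control wall time (230.512260 seconds) multiplied by 1.1, so the sketch below assumes a 10% tolerance over the control run. That factor is inferred from these two numbers only and is not taken from the GSI regression scripts; the `exceeds_time_threshold` helper is hypothetical.

```python
# Wall times (seconds) copied from the stdout lines in this comment.
WALL_TIMES = {
    "hiproc": {"contrl": 230.512260, "updat": 257.991474},
    "loproc": {"contrl": 260.443699, "updat": 261.777999},
}

# Assumed tolerance: 10% over the control run, inferred from
# 230.512260 * 1.1 = 253.563486 (the threshold quoted in the failure message).
TOLERANCE = 1.10


def exceeds_time_threshold(control_seconds, update_seconds, tolerance=TOLERANCE):
    """Return (exceeded, threshold) for an update run measured against its control."""
    threshold = control_seconds * tolerance
    return update_seconds > threshold, threshold


report = {}
for case, times in WALL_TIMES.items():
    exceeded, threshold = exceeds_time_threshold(times["contrl"], times["updat"])
    slowdown_pct = 100 * (times["updat"] / times["contrl"] - 1)
    report[case] = (exceeded, threshold, slowdown_pct)
    print(
        f"{case}: threshold = {threshold:.6f} s, "
        f"updat = {times['updat']:.6f} s, "
        f"slowdown = {slowdown_pct:.1f}%, exceeded = {exceeded}"
    )
```

Under this assumption only the hiproc pair trips the check, by a margin of a few seconds, while the loproc pair stays well inside it. That is consistent with treating the failure as run-to-run timing variability rather than a regression.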

@weihuang-jedi
Contributor Author

@RussTreadon-NOAA I just checked: unified-env is not needed, as gsi-addon_env is sufficient. I removed the commented-out lines as well.
Thanks for catching these!
Wei

Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Approve.

@RussTreadon-NOAA
Contributor

@weihuang-jedi , this PR can not move forward until we have another peer review. Let me know who the second reviewer is and I will add this person as a reviewer.

@RussTreadon-NOAA
Contributor

Thank you @CoryMartin-NOAA !

@RussTreadon-NOAA RussTreadon-NOAA merged commit 38bdb95 into NOAA-EMC:develop May 7, 2024
4 checks passed
@weihuang-jedi
Contributor Author

Thank you @CoryMartin-NOAA and @RussTreadon-NOAA !

@weihuang-jedi weihuang-jedi deleted the wei-epic-aws branch May 7, 2024 13:51
Development

Successfully merging this pull request may close these issues.

Add modules to compile on AWS
3 participants