Update Frontier machine/compilers following system update #6579
grnydawn commented on Sep 2, 2024
- Retain current software modules instead of updating to the latest versions to prioritize reliability.
- Add linker options to use GCC 12.2, addressing linker errors.
- Use the Fortran linker to resolve additional linker errors.
- Replace hipcc with mpicxx for the MPICXX macro in the GPU compiler definitions.
- Reorder compiler priority to favor reliability over performance.
- Temporarily comment out the ADIOS2 configurations.
How was this tested?
The following table summarizes the test results. Test results are from running the e3sm_developer test suite without debug cases. It took an excessive amount of time to build debug cases with the crayclang compiler.
The crayclang compiler (both the current and latest versions) has internal compiler issues (a segfault from the optcg internal compiler module) and excessive compile times. OLCF tickets for the latest version: OLCFHELP-19210, OLCFHELP-19356, and OLCFHELP-19435. The amdclang compiler segfaults while compiling some test cases. Essentially, AMD has asked us to wait until they release a new Fortran compiler based on LLVM.
Try testing with this suite: e3sm_gpucxx
@rljacob, I got the following test results with the e3sm_gpucxx test suite.
All scream failures, except for the amdclanggpu compiler case, are caused by the following error:
The following cmake statements in "$E3SM/components/cmake/build_eamxx.cmake" caused the error:
The only cmake file that Scream currently supports for Frontier is "frontier-scream-gpu.cmake", so none of the above machine/compiler names worked.
@grnydawn, eamxx won't work if there's no eamxx machine file. Eamxx clears all E3SM settings and manages its own flags, so you will end up with no compiler flags if there's no eamxx machine file.
Eamxx is currently using its own machine files for Frontier, so you probably don't need to worry about this for now.
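For anyone who wants to check this ahead of a test run, here is a minimal sketch that reports which machine names have no matching `<name>.cmake` file in a directory. It is not part of the E3SM or EAMxx build system. The function name `missing_machine_files`, the directory argument, and the name `frontier-crayclang` are hypothetical placeholders; only `frontier-scream-gpu.cmake` comes from the discussion above.

```python
from pathlib import Path
import tempfile


def missing_machine_files(machine_file_dir, machine_names):
    """Return the names that have no '<name>.cmake' file in machine_file_dir.

    Hypothetical helper: the one-file-per-machine-name layout is an assumption
    based on the 'frontier-scream-gpu.cmake' file mentioned in this thread.
    """
    directory = Path(machine_file_dir)
    return [
        name
        for name in machine_names
        if not (directory / f"{name}.cmake").is_file()
    ]


if __name__ == "__main__":
    # Build a throwaway directory holding only the one supported file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "frontier-scream-gpu.cmake").write_text("# placeholder\n")
        # 'frontier-crayclang' is a made-up name used only for illustration.
        names = ["frontier-scream-gpu", "frontier-crayclang"]
        print("No machine file for:", missing_machine_files(tmp, names))
```

Point the first argument at wherever the eamxx machine files live in your checkout; that location is deliberately left as a parameter here.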
What stops EAMxx from using the regular E3SM build system components?
@rljacob, a long time ago, we thought eamxx would need to manage its own flags, even when part of a CIME build. I think we have since decided that we should just use the CIME system, but I haven't gotten around to changing things yet.
If there are no objections, I will merge this PR to the next branch and, if no new issues arise, into the master branch.
Update Frontier machine/compilers following system update [BFB] no code change. retain current software modules