Regional configurations abort sporadically with a floating-point exception in subroutine a2b_ord2 in FV3/atmos_cubed_sphere/model/a2b_edge.F90 on Hera here:
The routine contains only additions and multiplications, so the exception is probably triggered by a NaN in the input data rather than by the arithmetic itself. Possible sources are uninitialized memory or boundary conditions that were never filled (they are initialized with signalling NaN).
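To tell which kind of NaN is present when inspecting a raw memory or debugger dump, the bit pattern can be classified directly. The sketch below assumes IEEE 754 binary64 values and the common convention (x86, ARM) that a set top mantissa bit means "quiet"; the helper names are hypothetical and not part of FV3.

```python
import struct

EXP_MASK = 0x7FF0000000000000   # all 11 exponent bits set: infinity or NaN
FRAC_MASK = 0x000FFFFFFFFFFFFF  # the 52 mantissa bits
QUIET_BIT = 0x0008000000000000  # top mantissa bit: set for a quiet NaN


def float_bits(x):
    """Return the 64-bit pattern of a Python float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def classify_bits(bits):
    """Classify a raw 64-bit pattern (hypothetical helper)."""
    if bits & EXP_MASK != EXP_MASK:
        return "finite"
    frac = bits & FRAC_MASK
    if frac == 0:
        return "infinity"
    # Assumption: quiet bit set means quiet NaN, clear means signalling NaN.
    return "quiet NaN" if frac & QUIET_BIT else "signalling NaN"
```

A signalling pattern in a halo cell would point toward an unfilled boundary; arbitrary values or quiet NaNs would be more consistent with uninitialized memory.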
The crashes seem to start with hash 8e7b61b in PR #2327, which adds a new omega calculation to the dynamical core. It's hard to be certain, since the crash doesn't happen every time.
Presently, the regression test system lacks any error checking, so it cannot distinguish between a crash like this one and a test whose results have changed.
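As a rough illustration of the missing distinction, a test driver could look at the exit status before comparing outputs. This is only a sketch: `classify_run` and `describe_exit` are hypothetical names, not part of the actual regression test system, and it assumes the Python `subprocess` convention that a negative return code means the process was killed by a signal.

```python
import signal
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    CRASHED = "crashed"
    RESULTS_CHANGED = "results changed"


def classify_run(returncode, outputs_match):
    """Classify one test run (hypothetical helper)."""
    # A non-zero status means the model aborted, so comparing
    # outputs against the baseline would be meaningless.
    if returncode != 0:
        return Outcome.CRASHED
    return Outcome.PASSED if outputs_match else Outcome.RESULTS_CHANGED


def describe_exit(returncode):
    """Turn a subprocess-style return code into a readable message."""
    if returncode < 0:
        # Assumption: negative codes follow the subprocess convention
        # of "killed by signal number -returncode".
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exit status {returncode}"
```

Jobs launched through a batch scheduler or MPI launcher may report failures differently, so the status check would need adapting to whatever the workflow actually returns.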
To Reproduce:
1. Enable error checking in the workflow, so that it pauses on error instead of reporting the test as having changed results.
2. Run the regression tests on Hera a few times.
3. Check the failed tests for floating-point exceptions.
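The last step can be partly automated by scanning each failed test's log for floating-point exception messages. The sketch below is a hedged example: the patterns are assumptions based on typical compiler runtime and signal messages, and should be adjusted to whatever the logs on Hera actually contain.

```python
import re

# Assumed message fragments; extend to match the real log output.
FPE_PATTERN = re.compile(
    r"floating[- ]point exception|floating invalid|SIGFPE",
    re.IGNORECASE,
)


def find_fpe_lines(log_text):
    """Return (line_number, line) pairs that mention a floating-point exception."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if FPE_PATTERN.search(line)
    ]
```

Running this over the log of every failed test separates the sporadic crashes from genuine result changes, which contain no such message.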
Additional context
Only tested on Hera.
My PR description had an error: all regional configurations are affected, whether they have a nest or not.
SamuelTrahanNOAA changed the title from "sporadic floating point errors in FV3/atmos_cubed_sphere/a2b_edge.F90 for nested configurations" to "sporadic floating point errors in FV3/atmos_cubed_sphere/model/a2b_edge.F90 for nested configurations" on Jul 11, 2024.