Feature/mask NaNs in training loss function #56

Merged: 15 commits merged into develop on Nov 22, 2024

Member

@sahahner sahahner commented Oct 2, 2024

Variables with missing values that are imputed by the imputer should not be considered in the loss.

The NaN masks are prepared in the imputer. The remapper contains a new function to remap the NaN masks from the imputer.

This goes together with PR #72 from anemoi-training.
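A minimal sketch of the idea described above, assuming a plain mean-squared-error loss and nested lists instead of tensors. The function names (`build_nan_mask`, `impute`, `masked_mse`) are hypothetical and chosen for illustration only; they are not the anemoi-models API.

```python
import math


def build_nan_mask(rows):
    """Return 1.0 where a value is present and 0.0 where it is NaN."""
    return [[0.0 if math.isnan(v) else 1.0 for v in row] for row in rows]


def impute(rows, fill_value=0.0):
    """Replace NaNs with a constant so the model never sees missing values."""
    return [[fill_value if math.isnan(v) else v for v in row] for row in rows]


def masked_mse(pred, target, mask):
    """Mean squared error over the positions where mask is 1.0."""
    total = 0.0
    count = 0.0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            # target must already be imputed: 0.0 * NaN would still be NaN
            total += m * (p - t) ** 2
            count += m
    return total / count if count else 0.0


# Hypothetical data: the second value of the first row is missing.
raw_target = [[1.0, float("nan")], [3.0, 4.0]]
mask = build_nan_mask(raw_target)   # computed before imputation
target = impute(raw_target)
pred = [[2.0, 100.0], [3.0, 6.0]]
loss = masked_mse(pred, target, mask)
```

The mask is built before imputation because the fill value is indistinguishable from real data afterwards. The large error at the imputed position (100.0 versus the fill value) does not influence `loss`.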

@codecov-commenter

codecov-commenter commented Oct 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.85%. Comparing base (0e03d33) to head (15cf7b9).
Report is 1 commit behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop      #56   +/-   ##
========================================
  Coverage    99.85%   99.85%           
========================================
  Files           23       23           
  Lines         1350     1374   +24     
========================================
+ Hits          1348     1372   +24     
  Misses           2        2           


@floriankrb
Member

This functionality seems to be related to ecmwf/anemoi-training#79.
Perhaps the masks.py created by @JPXKQX should move into anemoi-models, and a [refactored version of] OutputMask could be used here?

@JPXKQX
Member

JPXKQX commented Oct 15, 2024

I see some similarities between the output masking and the post-processors, but the part that doesn't fit is that the post-processors are only applied at the end of the rollout. The masking, in contrast, is called not only at the end but also between all the rollout steps (to roll out the boundary forcing). So I don't know whether it's better to include it as a special post-processor or to leave it in anemoi-training.

I would say that we can do the loss masking here similar to the imputer, but I think the masking should remain in anemoi-training.
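The difference described above can be sketched as follows. This is an illustration under assumptions, not the anemoi-training code: `apply_output_mask`, `rollout`, and all arguments are hypothetical names, and states are plain lists.

```python
def apply_output_mask(pred, boundary, keep_pred):
    """Keep the prediction where keep_pred is True, else use the boundary forcing."""
    return [p if k else b for p, b, k in zip(pred, boundary, keep_pred)]


def rollout(state, step_fn, boundaries, keep_pred, postprocess):
    """Run one model step per boundary entry, masking after every step."""
    for boundary in boundaries:
        state = step_fn(state)
        # Masking happens inside the loop, so the next step starts
        # from the boundary values rather than from the prediction.
        state = apply_output_mask(state, boundary, keep_pred)
    # Post-processing happens once, after the last rollout step.
    return postprocess(state)


# Hypothetical toy model: each step adds 1; post-processing doubles the values.
result = rollout(
    state=[0, 0],
    step_fn=lambda s: [v + 1 for v in s],
    boundaries=[[10, 10], [20, 20]],
    keep_pred=[True, False],
    postprocess=lambda s: [v * 2 for v in s],
)
```

In this sketch the second position is overwritten by the boundary forcing after every step, while the post-processor only sees the final state.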

@sahahner sahahner marked this pull request as ready for review November 13, 2024 09:09
@sahahner sahahner self-assigned this Nov 13, 2024
@sahahner sahahner added the enhancement New feature or request label Nov 13, 2024
Collaborator

@jakob-schloer jakob-schloer left a comment

After testing and discussing with @sahahner, I approve the changes.

@sahahner sahahner merged commit fd2bcf1 into develop Nov 22, 2024
121 checks passed
6 participants