Releases: ecmwf/anemoi-training

0.3.0 - Loss & Callback Refactors

14 Nov 17:56
64915e6

Config Updates

Because the loss functions and callbacks were heavily refactored, resetting your configs is strongly advised:
anemoi-training config generate --override

In particular, the training and diagnostics configurations have changed substantially.

Added

  • training_loss
  • validation_metrics
  • variable_loss_scaling

Removed

  • loss_scaling

Changed

  • callbacks

Full Changelog: 0.2.2...0.3.0

0.2.2 - Maintenance: pin python <3.13

28 Oct 14:48
de98029

What's Changed

This release pins python <3.13 due to a missing dependency distribution. Support for python 3.13 is expected in the near future.

Changed

  • Lock python version <3.13 #107

Full Changelog: https://github.com/ecmwf/anemoi-training/blob/develop/CHANGELOG.md

0.2.1 - Bugfix: resuming mlflow runs

24 Oct 14:09
84fd53b

What's Changed

Added

  • Mlflow-sync to include new tag for server to server syncing #83
  • Mlflow-sync to include functionality to resume and fork server2server runs #83
  • Rollout training for Limited Area Models. #79
  • Feature: New Boolean1DMask class. Enables rollout training for limited area models. #79

Fixed

  • Mlflow-sync to handle creation of new experiments in the remote server #83
  • Fix for multi-gpu when using mlflow due to refactoring of _get_mlflow_run_params function #99
  • ci: fix pyshtools install error #100

Changed

  • Update copyright notice

Full Changelog: https://github.com/ecmwf/anemoi-training/blob/develop/CHANGELOG.md#021---bugfix-resuming-mlflow-runs---2024-10-24

0.2.0 - Feature release

16 Oct 11:20

What's Changed

This release brings some changes to the default config. Use the anemoi-training config CLI to re-generate your default config.

Added

  • Add anemoi-transform link to documentation
  • Codeowners file (#56)
  • Changelog merge strategy (#56)

Miscellaneous

  • Introduction of remapper to anemoi-models leads to changes in the data indices. Some preprocessors cannot be applied in-place anymore.

Functionality

  • Enable the callback for plotting a histogram for variables containing NaNs
  • Enforce same binning for histograms comparing true data to predicted data
  • Fix: Inference checkpoints are now saved according to the frequency settings defined in the config #37
  • Feature: Add configurable models #50
  • Feature: Authentication support for mlflow sync - #51
  • Feature: Support training for datasets with missing time steps #48
  • Feature: AnemoiMlflowClient, an mlflow client with authentication support #86
  • Long Rollout Plots
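
The two histogram items above (variables containing NaNs, and identical binning for true versus predicted data) can be illustrated with a minimal pure-Python sketch. This is not the callback's actual code: the function names, the equal-width binning, and the choice to drop NaNs before counting are all assumptions made for illustration.

```python
import math


def drop_nans(values):
    """Return the values with NaNs removed (assumed handling, for illustration)."""
    return [v for v in values if not math.isnan(v)]


def shared_bin_edges(truth, pred, n_bins=10):
    """Return n_bins + 1 equal-width edges spanning both samples.

    Assumes at least one non-NaN value across the two samples.
    """
    combined = drop_nans(truth) + drop_nans(pred)
    lo, hi = min(combined), max(combined)
    if lo == hi:  # degenerate range: widen so bins have non-zero width
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def histogram(values, edges):
    """Count values per bin; the last bin includes its right edge."""
    n_bins = len(edges) - 1
    lo, hi = edges[0], edges[-1]
    counts = [0] * n_bins
    for v in drop_nans(values):
        if v < lo or v > hi:
            continue
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


truth = [0.0, 1.0, 2.0, float("nan")]
pred = [1.0, 3.0, 4.0]
edges = shared_bin_edges(truth, pred, n_bins=4)
truth_counts = histogram(truth, edges)  # counted on the shared edges
pred_counts = histogram(pred, edges)    # same edges, so bins are comparable
```

Because both histograms are counted against the same edges, bin i of one can be compared directly with bin i of the other.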

Fixed

  • Fix TypeError raised when trying to JSON serialise datetime.timedelta object - #43
  • Bugfixes for CI (#56)
  • Fix mlflow subcommand on python 3.9 #62
  • Show correct subcommand in MLFlow - Addresses #39 in #61
  • Fix interactive multi-GPU training #82
  • Allow 500 characters in mlflow logging #88
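
For context on the datetime.timedelta item above: Python's json module raises a TypeError for types it does not know how to encode, and timedelta is one of them. The sketch below shows one common workaround using the `default` hook of `json.dumps`. It is not necessarily how #43 was resolved; the helper name `to_jsonable` and the `frequency` key are hypothetical.

```python
import datetime
import json


def to_jsonable(obj):
    """Fallback for json.dumps: render timedelta as a string (illustrative choice)."""
    if isinstance(obj, datetime.timedelta):
        return str(obj)
    # Mirror the standard behaviour for anything else we cannot handle.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


config = {"frequency": datetime.timedelta(hours=6)}  # hypothetical config entry

try:
    json.dumps(config)  # without a fallback this raises TypeError
    raised = False
except TypeError:
    raised = True

encoded = json.dumps(config, default=to_jsonable)
```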

Changed

  • Updated configuration examples in documentation and corrected links - #46
  • Remove credential prompt from mlflow login, replace with seed refresh token via web - #78
  • Update CODEOWNERS

Full Changelog: https://github.com/ecmwf/anemoi-training/blob/develop/CHANGELOG.md#020---feature-release---2024-10-16

0.1.0 - Anemoi training - First release

16 Aug 15:28
af48387

What's Changed

Added

Subcommands

  • Subcommand for training: anemoi-training train
  • Subcommand for config generation
  • Subcommand for mlflow: login and sync
  • Subcommand for checkpoint handling

Functionality

  • Searchpaths for Hydra configs, to enable configs in CWD, ANEMOI_CONFIG_PATH env, and .config/anemoi/training in addition to package defaults
  • MlFlow token authentication
  • Configurable pressure level scaling
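
As a rough illustration of the search-path item above, the sketch below collects the three extra locations named there. It does not reproduce the package's Hydra plugin: the function name is hypothetical, the ordering is an assumption (the release notes do not state precedence), and treating .config/anemoi/training as relative to the user's home directory is also an assumption.

```python
import os
from pathlib import Path


def candidate_config_dirs(env=None, cwd=None, home=None):
    """List extra config directories in an assumed order: CWD, env var, user dir."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)

    dirs = [cwd]
    env_path = env.get("ANEMOI_CONFIG_PATH")
    if env_path:  # only added when the variable is set and non-empty
        dirs.append(Path(env_path))
    # Assumption: the user-level directory lives under the home directory.
    dirs.append(home / ".config" / "anemoi" / "training")
    return dirs


example_dirs = candidate_config_dirs(
    env={"ANEMOI_CONFIG_PATH": "/tmp/cfg"}, cwd="/work", home="/home/u"
)
```

The package defaults would still apply in addition to whichever of these directories exist.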

Continuous Integration / Deployment

  • Downstream CI to test all dependencies with changes
  • Changelog Status check
  • Readthedocs PR builder
  • Changelog Release Updater Workflow

Miscellaneous

  • Extended ruff Ruleset
  • Added Docsig pre-commit hook
  • __future__ annotations for typehints
  • Added Typehints where missing
  • Added Changelog
  • Correct errors in callback plots
  • Fix error in the default config
  • Example Slurm config

Changed

Move to Anemoi Ecosystem

  • Fixed PyPI packaging
  • Use of Anemoi models
  • Use of Anemoi graphs
  • Adjusted tests to work with new Anemoi ecosystem
  • Adjusted configs to reasonable common defaults

Functionality

  • Changed hardware-specific keys in the configs to ??? to trigger "missing"
  • __len__ of NativeGridDataset
  • Configurable dropout in attention layer
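
In OmegaConf, which Hydra builds on, the string ??? marks a mandatory value that must be supplied before it is read. The plain-Python sketch below shows the idea of spotting such placeholders in a nested config before a run starts. It does not use OmegaConf itself, and the function name and config keys are hypothetical.

```python
MISSING = "???"  # placeholder for a mandatory, not-yet-set value


def find_missing(config, prefix=""):
    """Return dotted paths of all keys whose value is still the ??? placeholder."""
    missing = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            missing.extend(find_missing(value, path))
        elif value == MISSING:
            missing.append(path)
    return missing


# Hypothetical config: key names are for illustration only.
example = {
    "hardware": {"num_gpus_per_node": "???", "paths": {"data": "???"}},
    "training": {"max_epochs": 10},
}
unset_keys = find_missing(example)
```

Listing the unset keys up front makes it clear which hardware-specific values a user still has to provide.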

Docs

  • First draft on Read the Docs
  • Fixed docstrings

Miscellaneous

  • Moved callbacks into folder to facilitate future refactor
  • Adjusted PyPI release infrastructure to common ECMWF workflow
  • Bumped versions in Pre-commit hooks
  • Fix crash when logging hyperparameters with missing values in the config
  • Fixed "null" tracker metadata when tracking is disabled, now returns an empty dict
  • Pinned numpy<2 until the migration can be fully tested
  • ci: path ignore of docs for downstream ci
  • ci: make python QA reusable
  • ci: permissions on changelog updater

Removed

  • Dependency on mlflow-export-import
  • Specific user configs
  • len function of NativeGridDataset, as it led to bugs

Release Work

Original Contributions in AIFS by

Full Changelog: https://github.com/ecmwf/anemoi-training/blob/develop/CHANGELOG.md#0.1.0