Releases: mala-project/mala
v1.3.0 - Into the multi-GPU-niverse
New features
- Multi-GPU inference: Models can now make predictions on an arbitrary number of GPUs
- Multi-GPU training: Models can now be trained on an arbitrary number of GPUs (a minimal training sketch follows this list)
- MALA now works with 2D materials, i.e., any system which is only periodic in two dimensions
- Bispectrum descriptor calculation is now possible in Python
  - This route is significantly slower than LAMMPS, but can be helpful for developers who want to test the entire MALA modeling workflow without installing LAMMPS
- Logging for network training has been overhauled and now allows for the logging of multiple metrics
- (EXPERIMENTAL) Implementation of a mutual-information-based metric to replace/complement the ACSD
- (EXPERIMENTAL) Implementation of a class for LDOS alignment to a reference energy value; this can be useful for models across multiple mass densities
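As a rough illustration of the new multi-GPU training path, here is a minimal sketch that follows the basic MALA workflow (`Parameters`, `DataHandler`, `Network`, `Trainer`) and switches on the new DDP flag; snapshot names, paths, layer sizes, and the launch command are placeholders, not verbatim from this release.

```python
# Minimal multi-GPU training sketch (placeholder data paths and sizes).
import mala

params = mala.Parameters()
params.use_gpu = True   # run training on GPU(s)
params.use_ddp = True   # new in v1.3.0: distribute training via PyTorch DDP

# Register descriptor/LDOS snapshot pairs for training and validation.
data_handler = mala.DataHandler(params)
data_handler.add_snapshot("snapshot0.in.npy", "data/", "snapshot0.out.npy", "data/", "tr")
data_handler.add_snapshot("snapshot1.in.npy", "data/", "snapshot1.out.npy", "data/", "va")
data_handler.prepare_data()

# Small feed-forward network sized from the data dimensions.
params.network.layer_sizes = [data_handler.input_dimension, 100,
                              data_handler.output_dimension]
network = mala.Network(params)

trainer = mala.Trainer(params, network, data_handler)
trainer.train_network()

# Launch with PyTorch's distributed launcher, e.g.:
#   torchrun --nproc_per_node=4 train_model.py
```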
Changes to API/user experience
- New parallelization parameters available (illustrated in the sketch after this list):
  - `use_lammps` - enable/disable LAMMPS (enabled by default, recommended for optimal performance; automatically disabled if no LAMMPS installation is found on the machine)
  - `use_atomic_density_formula` - enable total energy evaluation based on a Gaussian representation (enabled if LAMMPS and GPU are enabled, recommended for optimal performance; details can be found in our paper on size transfer)
  - `use_ddp` - enable/disable DDP, i.e., PyTorch's distributed training scheme (disabled by default)
- Multiple LAMMPS/QE calculations can now be run in one directory
- Prior to this version, doing so would lead to problems due to the file-based nature of these interfaces
- This allows for multiple simultaneous inferences in the same folder
- Class `SNAP` and all associated options are deprecated; use `Bispectrum` and associated options instead
- Default units for reading from .cube files are now set to units commonly used within Quantum ESPRESSO; this should make it easier to avoid inconsistencies in data sampling
- ASE calculator `MALA` now reads models with `load_run()` instead of `load_model`, which is more consistent with the rest of MALA (see the inference sketch after this list)
- Error reporting with the `Tester` class has been improved; all errors and energy values reported there are now consistently given in meV/atom
- MALA calculators (LDOS, density, DOS) now also read energy contributions and forces from Quantum ESPRESSO output files; these can be accessed via properties
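To make the new flags and the renamed loader concrete, here is a small sketch; the attribute placement on `Parameters`, the run name, the structure file, and the exact `load_run()` call are illustrative assumptions, not taken verbatim from this release.

```python
# Sketch: new parallelization flags plus ASE-calculator inference via load_run().
import mala
from ase.io import read

# New flags as announced above (defaults noted in the comments).
params = mala.Parameters()
params.use_lammps = True                  # enabled by default; auto-disabled if LAMMPS is absent
params.use_atomic_density_formula = True  # Gaussian-representation total energy (LAMMPS + GPU)
params.use_ddp = False                    # PyTorch DDP training, disabled by default

# The ASE calculator is now loaded with load_run() instead of load_model().
# "my_model" and the structure file are placeholders.
atoms = read("my_structure.xyz")
atoms.calc = mala.MALA.load_run("my_model")
print(atoms.get_potential_energy())
```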
Fixes
- Updated various performance/accessibility issues of CI/CD
- Fixed compatibility with newer Optuna versions
- Added missing docstrings
- Fixed shuffling interface, arbitrary numbers of shuffled snapshots can now be created without loss of information
- Fixed inconsistency of density dimensions when reading directly from a .cube file
- Fixed error when using GPU graphs with arbitrary batch sizes
v1.2.1 - Minor bugfixes
This release fixes some minor issues and bugs, and updates some of the meta information. It also serves as a point of reference for an upcoming scientific work.
Change notes:
- Updated MALA logos
- Updated Tester class to also give Kohn-Sham energies alongside LDOS calculations
- Updated CITATION.cff file to reflect new team members and scientific supervisors
- Fixed bug that would crash models trained with horovod when loaded for inference without horovod
- Fixed bug that would crash training when using Optuna+MPI for hyperparameter optimization (GPU compute graph usage had not been properly adapted for this scenario)
- Deactivated pytorch profiling by default, can still be manually enabled
v1.2.0 - GPU and you
New features
- Production-ready inference options
- Full inference (from ionic configuration to observables) on either a single GPU or distributed across multiple CPUs (multi-GPU support still in development)
- Access to (volumetric) observables within seconds
- Fast training speeds due to optimal GPU usage
- Training on large data sets through improved lazy-loading functionalities and data shuffling routines (see the shuffling sketch after this list)
- Fast hyperparameter optimization through distributed optimizers (optuna) and training-free surrogate metrics (NASWOT/ACSD)
- Easy-to-use interface through a single `Parameters` object for reproducibility and modular design
- Internal caching system for intermediate quantities (e.g. DOS, density, band energy) for improved performance
- Experimental features for advanced users:
- MinterPy: Polynomial interpolation based descriptors
- OpenPMD
- OF-DFT-MD interface to create initial configurations for ML-based sampling
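Below is a minimal sketch of the new data-shuffling routine referenced above; the `DataShuffler` class name, its `add_snapshot`/`shuffle_snapshots` methods, and all file names/paths are assumptions made for illustration.

```python
# Sketch: shuffle several descriptor/LDOS snapshot pairs into mixed snapshots,
# which plays well with the cached lazy loading added in this release.
import mala

params = mala.Parameters()

shuffler = mala.DataShuffler(params)
# Placeholder snapshot names and paths.
shuffler.add_snapshot("snapshot0.in.npy", "data/", "snapshot0.out.npy", "data/")
shuffler.add_snapshot("snapshot1.in.npy", "data/", "snapshot1.out.npy", "data/")

# Write shuffled snapshots to disk (save path and naming pattern are placeholders).
shuffler.shuffle_snapshots(complete_save_path="data_shuffled/",
                           save_name="shuffled_snapshot*")
```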
Change notes:
- Full (serial) GPU inference added
- MALA now operates on FP32
- Added functionality for data shuffling
- Added functionality for cached lazy loading
- Improved GPU usage during training
- Added convenience functions, e.g., for ACSD analysis
- Fixed several bugs across the code
- Overhauled documentation
v1.1.0 - (very late) Spring cleaning
Features
- Parallel preprocessing, network training and model inference
- Distributed hyperparameter optimization (Optuna) and distributed training-free network architecture optimization (NASWOT)
- Reproducibility through a single `Parameters` object, with an easy interface to JSON for automated sweeps (see the sketch after this list)
- Internal caching system for intermediate quantities (e.g. DOS, density, band energy) for improved performance
- Modular design
- OF-DFT-MD interface to create initial configurations for ML-based sampling
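As a quick illustration of the JSON interface mentioned above, here is a sketch of a save/reload round trip; the method names `save()` and `load_from_file()` and the file name are assumptions, not verbatim from these notes.

```python
# Sketch: persist a Parameters object to JSON and restore it later,
# e.g. to reproduce a run or to template automated parameter sweeps.
import mala

params = mala.Parameters()
params.network.layer_sizes = [10, 100, 10]   # placeholder settings

# Serialize all settings to JSON (file name is a placeholder).
params.save("training_parameters.json")

# Later, or in a sweep script, restore the exact same configuration.
restored_params = mala.Parameters.load_from_file("training_parameters.json")
```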
Change notes:
- MALA now operates internally in Angstrom consistently
- Volumetric data that has been created with MALA v1.0.0 can still be used, but unit conversion has to be added to the scripts in question
- Implemented caching functionality
- The old post-processing API is still fully functional, but will not use the caching functions; instead, MALA now has a more streamlined API tying calculators to data
- More flexible data conversion methods
- Improved Optuna distribution scheme
- Implemented parallel total energy inference
- Reduced import time for MALA module
- Several smaller bugfixes
v1.0.0 - First major release (PyPI version)
Features
- Preprocessing of QE data using LAMMPS interface and LDOS parser (parallel via MPI)
- Networks can be created and trained using pytorch (parallel via horovod)
- Hyperparameter optimization using optuna, orthogonal array tuning and neural architecture search without training (NASWOT) supported
- optuna interface supports distributed runs and NASWOT can be run in parallel via MPI
- Postprocessing using QE total energy module (available as separate repository)
- Network inference is parallel up to the total energy calculation, which is currently still serial
- Reproducibility through a single `Parameters` object, easy interface to JSON for automated sweeps
- Modular design
Change notes:
- Full integration of Sandia ML-DFT code into MALA (network architectures, misc code still open)
- Parallelization of routines:
- Preprocessing (both SNAP calculation and LDOS parsing)
- Network training (via horovod)
- Network inference (except for total energy)
- Technical improvements:
- Default parameter interface is now JSON-based
- Internal refactoring
v1.0.0 - First major release
Features
- Preprocessing of QE data using LAMMPS interface and LDOS parser (parallel via MPI)
- Networks can be created and trained using pytorch (parallel via horovod)
- Hyperparameter optimization using optuna, orthogonal array tuning and neural architecture search without training (NASWOT) supported
- optuna interface supports distributed runs and NASWOT can be run in parallel via MPI
- Postprocessing using QE total energy module (available as separate repository)
- Network inference is parallel up to the total energy calculation, which is currently still serial
- Reproducibility through a single `Parameters` object, easy interface to JSON for automated sweeps
- Modular design
Change notes:
- Full integration of Sandia ML-DFT code into MALA (network architectures, misc code still open)
- Parallelization of routines:
- Preprocessing (both SNAP calculation and LDOS parsing)
- Network training (via horovod)
- Network inference (except for total energy)
- Technical improvements:
- Default parameter interface is now JSON-based
- Internal refactoring
v0.2.0 - Regular Update
Regular update of MALA. This release mostly updates the hyperparameter optimization capabilities of MALA and fixes some minor bugs. Changelog:
- Fixed installation instructions and OAT part of installation
- Improved and added examples; made LDOS-based examples runnable
- Fixed direct string concatenation for file interaction and replaced it with `path` functions
- Improved optuna hyperparameter optimization (ensemble objectives, band energy as validation loss, distributed optimization, performance study)
- Improved OAT and NASWOT implementation
- Fixed several things regarding documentation and citation
- Added check ensuring that QE-MD generated input files adhere to PBC
- Implemented visualization via tensorboard
- Stylistic improvements (import ordering fixed, TODOs converted to issues or resolved, replaced unnecessary `get_data_repo()` function)
- Added bump version
- Set up mirror to the casus org and fixed pipeline deployment issues when working from forks
Test data repository version: v1.1.0
v0.1.0 - Accelerating Finite-Temperature DFT with DNN
First alpha release of MALA. This code accompanies the publication of the same name (https://doi.org/10.1103/PhysRevB.104.035120).
Features:
- Preprocessing of QE data using LAMMPS interface and parsers
- Networks can be created and trained using pytorch
- Hyperparameter optimization using optuna
- Experimental: orthogonal array tuning and neural architecture search without training supported
- Postprocessing using QE total energy module (available as separate repository)
Test data repository version: v0.1.0
v0.0.2
Added code from Sandia National Laboratories and Oak Ridge National Laboratory. Code developments will be merged from this point forward.
v0.0.1
Current features:
- Preprocessing of QE data using LAMMPS interface and parsers
- Networks can be created and trained using pytorch
- Hyperparameter optimization using optuna
- Experimental: orthogonal array tuning and neural architecture search without training supported
- Postprocessing using QE total energy module