
Implement on-the-fly descriptor calculation #630

Draft
wants to merge 17 commits into develop
Conversation

@RandomDefaultUser (Member) commented Jan 7, 2025

This PR gives MALA the capability to compute descriptors on the fly. This allows for more refined hyperparameter optimization of descriptor-related hyperparameters, since the training-free methods we initially envisioned may not (yet) be sufficient in many cases. The starting point for these on-the-fly methods is the .json files the DataConverter class can generate.

Since this is a pretty fundamental change in the MALA pipeline, it requires thorough testing. Namely, the following capabilities have to be implemented:

  • Single CPU/GPU training:
    • RAM based training
    • RAM based training w/ checkpointing
    • Lazy loading training
    • Lazy loading training w/ checkpointing
    • Lazy loading prefetch training
    • Lazy loading prefetch training w/ checkpointing
  • DDP training:
    • RAM based training
    • RAM based training w/ checkpointing
    • Lazy loading training
    • Lazy loading training w/ checkpointing
  • Adapt Tester class
  • Write Example
  • Write Documentation
  • Rename "additional data" in the DataConverter class
  • Add Shuffling to this on-the-fly calculation
  • Integrate new functionality in test suite
  • Add an automatic detection of provided data

@RandomDefaultUser added the enhancement label on Jan 7, 2025
@RandomDefaultUser self-assigned this on Jan 7, 2025
@RandomDefaultUser marked this pull request as draft on January 7, 2025 11:50
@RandomDefaultUser (Member, Author) commented:
I just looked into the DDP side of things. The problem seems to be that DDP allocates the GPUs for its own usage, so when LAMMPS wants to use them, they are not free. This should still be solvable, but it requires further modifications to the code. I will first address the other open issues and return to this one later, possibly in a separate PR.
