RTFS DeepMD #4

Open
1 of 2 tasks
markcoletti opened this issue Oct 24, 2023 · 2 comments
Assignees
Labels
question Further information is requested

Comments

Contributor

markcoletti commented Oct 24, 2023

Take a deep dive into the DeepMD code-base. We need to understand fundamentally how it works.

Contributor

asedova commented Oct 24, 2023

Found the data loader in https://github.com/deepmodeling/deepmd-kit/blob/master/deepmd/train/trainer.py. It uses https://github.com/deepmodeling/deepmd-kit/tree/master/deepmd/utils/random.py and data_system.py in the same utils directory. random.py is just a wrapper around NumPy's older RandomState generator, which NumPy classes as legacy rather than strictly deprecated, but a seed is passed in from the input JSON file, so that should work OK. Otherwise, the frames are simply chosen using this RNG (which is also strange: you would think you would want to train on ALL the frames, not a random subset that could contain repetitions). Either way, at the DeePMD level the data loading does seem to be deterministic. We may still have some kind of streaming happening at the TF or Horovod level, though.
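To make the point about seeding concrete, here is a minimal sketch of why a seeded RandomState gives a reproducible frame order, and why random selection can repeat frames. It is not DeePMD-kit code: `pick_frames` and its parameters are hypothetical names, and the draw-with-replacement behavior is an assumption made for illustration, not something confirmed from data_system.py.

```python
import numpy as np


def pick_frames(n_frames, batch_size, n_steps, seed):
    """Hypothetical stand-in for a seeded frame sampler (not DeePMD-kit code)."""
    # Legacy NumPy generator, seeded explicitly as the input file's seed would be.
    rng = np.random.RandomState(seed)
    # Assumption: each step draws batch_size frame indices independently,
    # so the same frame can show up again in a later step.
    return [
        rng.randint(0, n_frames, size=batch_size).tolist()
        for _ in range(n_steps)
    ]


# Two "runs" with the same seed should select identical frames.
run_a = pick_frames(n_frames=10, batch_size=4, n_steps=5, seed=42)
run_b = pick_frames(n_frames=10, batch_size=4, n_steps=5, seed=42)
same_seed_matches = run_a == run_b

# 20 draws from only 10 frames must contain repeats (pigeonhole principle).
flat = [i for batch in run_a for i in batch]
has_repeats = len(set(flat)) < len(flat)

print("same seed gives same frames:", same_seed_matches)
print("frames repeated within run:", has_repeats)
```

If the real loader behaves like this, any run-to-run differences would have to come from a layer above it, which is consistent with looking at TF and Horovod next.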

Contributor

asedova commented Oct 24, 2023

Next, need to check the TF/Horovod levels of distributed training to see whether there is some task stealing or asynchronous data streaming going on.

@asedova asedova added the question Further information is requested label Oct 24, 2023
3 participants