-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add mace_mp medium performance benchmark #647
base: develop
Are you sure you want to change the base?
Conversation
@hatemhelal, amazing thank you. I would say we could also add a small protien benchmark as an option. This would cover the low density case compared to diamond. Note that in the past I have found that pytorch internal benchmark tools |
Glad you brought this up, the synchronize call in the benchmarked function is trying to replicate what is done in the pytorch utils. I experimented a bit with different number of warmup rounds and the minimum number of samples and ended up at the values in this PR as they seemed to run reasonably fast and the time / step had a small std deviation. |
good suggestion and I can add this soon. |
I ran these benchmarks on both h100 and a100 and plotted the results in this notebook: I wasn't too sure about adding this notebook to the main repo but @ilyes319 let me know if you think this would be useful to add and I can add it to this PR. |
I think you need to indicate what kind of MACE model this was. |
it is very dangerous to put out speed test "of mace" in general without specifying which mace model it is. |
Yes it was a bit buried in the code, I've retitled the PR and added a note that this is using the mace-mp-medium model. |
it doesn't say anything in your notebook. maybe it needs to be above the timings table? |
I've updated the metrics to include compiled mace-mp-medium and updated the notebook |
Can you add to the book what the system is. I know you give nodes and edges, but that's opaque to many users. People expect time to mainly depend only on number of atoms, but in reality density also matters. It would also be nice to explicitly say what the main architecture parameters are: cutoff, number of channels, L. Again, it follows from "medium" but casual readers will not know, and it's a chance to educate. |
I am pretty sure I did 5k system sizes on fp64 on A100 stopping it at ~3k may give the wrong impression. |
This PR adds dual-use unittests and benchmarks to begin addressing #645. It would be helpful to have feedback on how representative these benchmarks are compared to production MD runs with either LAMMPS or OpenMM.
The benchmarks run as normal unittests and I found that adding the option to run the benchmark on the compiled variant significantly increased the time it takes to collect the performance metrics.
Running the benchmarks:
to include metrics with torch.compile:
After saving some metrics, invoking the test file as a script will output the collected metrics as a csv:
The benchmark currently only runs the mace-mp-medium variant but this could be expanded to include other models if that would be considered useful.