
Add mace_mp medium performance benchmark #647

Open · wants to merge 2 commits into base: develop

Conversation

@hatemhelal (Contributor) commented Oct 21, 2024

This PR adds dual-use unit tests and benchmarks to begin addressing #645. It would be helpful to get feedback on how representative these benchmarks are compared to production MD runs with either LAMMPS or OpenMM.

The benchmarks run as normal unit tests. I found that also running the benchmark on the compiled variant significantly increases the time it takes to collect the performance metrics, so that variant is opt-in (see the sketch below).
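A minimal sketch of the dual-use pattern, assuming the pytest-benchmark plugin and a CUDA device; the model loader and input builder below are hypothetical stand-ins for whatever the PR actually uses:

```python
import os

import pytest
import torch

# The torch.compile variant is opt-in because compilation dominates collection time.
VARIANTS = ["eager"]
if os.environ.get("MACE_FULL_BENCH") == "1":
    VARIANTS.append("compiled")


@pytest.mark.parametrize("variant", VARIANTS)
def test_mace_mp_medium(benchmark, variant):
    model = load_mace_mp_medium()  # hypothetical: load the mace-mp-medium model
    if variant == "compiled":
        model = torch.compile(model)
    batch = build_diamond_batch()  # hypothetical: periodic diamond input

    def step():
        out = model(batch)
        torch.cuda.synchronize()  # wait for GPU kernels so we time real work
        return out

    benchmark(step)
```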

Running the benchmarks:

pytest tests/test_benchmark.py --benchmark-save=<some name>

To include metrics with torch.compile:

MACE_FULL_BENCH=1 pytest tests/test_benchmark.py --benchmark-save=<some name>

After saving some metrics, invoking the test file as a script will output the collected metrics as CSV:

python tests/test_benchmark.py
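A minimal sketch of what that entry point could look like, assuming pytest-benchmark saved the metrics as JSON under its default .benchmarks/ directory:

```python
if __name__ == "__main__":
    import csv
    import json
    import sys
    from pathlib import Path

    writer = csv.writer(sys.stdout)
    writer.writerow(["file", "benchmark", "mean_s", "stddev_s"])
    # pytest-benchmark writes one JSON file per --benchmark-save invocation.
    for path in Path(".benchmarks").rglob("*.json"):
        data = json.loads(path.read_text())
        for bench in data["benchmarks"]:
            stats = bench["stats"]
            writer.writerow([path.name, bench["name"], stats["mean"], stats["stddev"]])
```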

The benchmark currently only runs the mace-mp-medium variant, but this could be expanded to include other models if that would be considered useful.

@ilyes319 (Contributor) commented Oct 21, 2024

@hatemhelal, amazing, thank you. I would say we could also add a small protein benchmark as an option. This would cover the low-density case compared to diamond.

Note that in the past I have found that PyTorch's internal benchmark tools (import torch.utils.benchmark as benchmark) provide more reliable results compared to MD runs.
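For reference, a minimal sketch of that approach; model and batch are placeholders for a real model and input:

```python
import torch.utils.benchmark as benchmark

timer = benchmark.Timer(
    stmt="model(batch)",
    globals={"model": model, "batch": batch},  # placeholders
)
# blocked_autorange handles warmup and CUDA synchronization internally.
print(timer.blocked_autorange(min_run_time=1.0))
```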

@hatemhelal (Contributor, Author)

> Note that in the past I have found that PyTorch's internal benchmark tools (import torch.utils.benchmark as benchmark) provide more reliable results compared to MD runs.

Glad you brought this up: the synchronize call in the benchmarked function is trying to replicate what is done in the PyTorch utils. I experimented a bit with different numbers of warmup rounds and different minimum numbers of samples and settled on the values in this PR, as they run reasonably fast and the time per step had a small standard deviation.
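For illustration, pytest-benchmark's pedantic mode exposes those knobs directly; the values and the benchmarked callable here are illustrative, not necessarily what the PR uses:

```python
def test_mace_mp_medium_step(benchmark):
    # run_single_step is a hypothetical callable wrapping one forward pass
    # (including the synchronize call discussed above).
    benchmark.pedantic(
        run_single_step,
        warmup_rounds=2,  # illustrative value
        rounds=10,        # illustrative value
    )
```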

@hatemhelal (Contributor, Author)

> we could also add a small protein benchmark as an option

Good suggestion, and I can add this soon.

@hatemhelal (Contributor, Author)

I ran these benchmarks on both H100 and A100 and plotted the results in this notebook:
https://github.com/hatemhelal/mace/blob/bench-results/benchmarks/bench_plots.ipynb

I wasn't too sure about adding this notebook to the main repo, but @ilyes319, let me know if you think it would be useful and I can add it to this PR.

@gabor1 (Collaborator) commented Oct 21, 2024

I think you need to indicate what kind of MACE model this was.

@gabor1 (Collaborator) commented Oct 21, 2024

It is very dangerous to put out a speed test "of MACE" in general without specifying which MACE model it is.

@hatemhelal hatemhelal changed the title Add mace performance benchmark Add mace_mp medium performance benchmark Oct 21, 2024
@hatemhelal (Contributor, Author)

Yes, it was a bit buried in the code. I've retitled the PR and added a note that this is using the mace-mp-medium model.

@gabor1 (Collaborator) commented Oct 21, 2024

It doesn't say anything in your notebook. Maybe it needs to be above the timings table?

@hatemhelal (Contributor, Author)

I've updated the metrics to include compiled mace-mp-medium and updated the notebook.

@gabor1 (Collaborator) commented Oct 24, 2024

Can you add to the notebook what the system is? I know you give nodes and edges, but that's opaque to many users. People expect time to depend mainly on the number of atoms, but in reality density also matters.

It would also be nice to explicitly say what the main architecture parameters are: cutoff, number of channels, L. Again, it follows from "medium", but casual readers will not know, and it's a chance to educate.

@alinelena (Contributor)

I am pretty sure I ran system sizes up to 5k atoms in fp64 on an A100; stopping at ~3k may give the wrong impression.
