venado system specs #548
@rfhaque where is the PR checklist template? Those are helpful for reviewing. In particular, for adding a new system, we also need: (@slabasan, if those are not listed in the PR template right now, we probably want to add them)
repo/caliper/for_aarch64.patch
@daboehme Can you please see the patch for Caliper on Arm? We are working with a Grace Hopper machine. It would be great if you can fix this in Caliper and put up a PR to benchpark to remove the patch (if this PR is already merged).
This should be fixed in Caliper v2.12.1, which is the current default in upstream spack
Also this patch @daboehme
This one was only required in v2.9.0, which is quite outdated at this point
@rfhaque can you please remove the patches and use the newest Caliper instead? Is there a reason you are not doing so?
@pearce8 This is just a temporary fix, till we pin the latest spack version.
repo/caliper/package.py
}


class LanlVenado(System):
@rfhaque Do they call them "clusters" or "partitions"? Do you also need to set "partition" in the slurm script to select the partition of the machine you want to use?
@alexrlongne might know.
@pearce8 While technically these are hardware partitions, Venado's slurm configuration defines its own "partitions" (standard, debug, gpu, etc.), each corresponding to some set of resources/qos
return {
    "cuda_arch": "90",
    "default_cuda_version": self.spec.variants["cuda"][0],
    "extra_batch_opts": '"-A llnl_ai_g -pgpu"',
@rfhaque what does this extra_batch_opts do?
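For reference while reviewing, here is a minimal sketch of how the option string above would be read, assuming it is handed to Slurm's sbatch unchanged (how benchpark actually consumes extra_batch_opts is not shown in this thread). In sbatch, -A selects the account to charge and -p selects the partition, so the string would request account llnl_ai_g on the gpu partition. The helper name parse_batch_opts is hypothetical and exists only for illustration.

```python
import argparse
import shlex


def parse_batch_opts(opts: str) -> dict:
    """Illustrative only: read an sbatch-style option string.

    Assumes the value is forwarded to sbatch as-is, where
    -A/--account is the charge account and -p/--partition is the
    Slurm partition. This is not benchpark's own parsing code.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-A", "--account")
    parser.add_argument("-p", "--partition")
    # The value in the diff carries embedded double quotes; drop them
    # before tokenizing so each flag becomes its own token.
    tokens = shlex.split(opts.strip('"'))
    # Flags this sketch does not model are collected, not rejected.
    known, unknown = parser.parse_known_args(tokens)
    return {
        "account": known.account,
        "partition": known.partition,
        "unknown": unknown,
    }


print(parse_batch_opts('"-A llnl_ai_g -pgpu"'))
```

Read this way, the value hard-codes both an account and a partition, which relates to the earlier question about setting "partition" in the slurm script.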
Description
This PR adds the system config specs for the LANL Venado system.
Dependencies: FIXME:Add a list of any dependencies.
Fixes issue(s): FIXME:Add list of relevant issues.
Type of Change
Checklist:
If adding/modifying a system:
  system.py file
  .github/workflows
If adding/modifying a benchpark:
  application.py and (maybe) package.py under a new directory for this benchmark
  section
If adding/modifying an experiment:
  experiment.py under existing directory for specific benchmark
If adding/modifying core functionality:
  .github/workflows and .gitlab/ci
  unit tests (if needed)