Use --switches=<count> and max_switch_wait #197

Open
laraPPr opened this issue Oct 23, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@laraPPr
Collaborator

laraPPr commented Oct 23, 2024

This should be implemented similarly to the memory hook.

@laraPPr laraPPr added the enhancement New feature or request label Oct 23, 2024
@casparvl
Collaborator

Proposed approach:

  • Add an 'extras' field in the config where system owners can declare how many nodes they have on a single leaf switch
  • Add a 'resources' field in the config to request all nodes from a single switch
  • Add to eessi_mixin: logic to check how many nodes there are to a switch, and if the requested node count is lower than that, ask for a single switch.
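A rough sketch of what the two config additions could look like in a ReFrame partition entry. The partition name, the `nodes_per_switch` key, the `switches` resource name, and the `num_switches` placeholder are all hypothetical names chosen for illustration; nothing here has been agreed on yet.

```python
# Sketch of a single partition entry from a ReFrame site configuration.
# All names marked "hypothetical" are assumptions, not an agreed-upon schema.
partition = {
    'name': 'cpu',                 # hypothetical partition name
    'scheduler': 'slurm',
    'launcher': 'srun',
    'environs': ['default'],
    'extras': {
        # Hypothetical key: number of nodes attached to one leaf switch,
        # declared by the system owner.
        'nodes_per_switch': 16,
    },
    'resources': [
        {
            # Hypothetical resource name; a test would request it through
            # its extra_resources, filling in the placeholder below.
            'name': 'switches',
            'options': ['--switches={num_switches}'],
        },
    ],
}

# How the option template would expand once the placeholder is filled in.
option = partition['resources'][0]['options'][0].format(num_switches=1)
print(option)
```

Keeping the node count in 'extras' and the scheduler flag in 'resources' means the test suite never hard-codes Slurm syntax; sites with a different scheduler could map the same resource name to their own option.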

@casparvl
Collaborator

Good to note: @satishskamath experimented and noticed that if you ask for e.g. 200 nodes on a single switch, but your system has 16 nodes per switch, the job will not be rejected (which is good), but it will wait for the max_switch_wait time (which is bad). That's why we proposed the above approach, which forces the end-user to configure this in the ReFrame config.
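To make the guard concrete, here is a minimal sketch of the decision the mixin could make. The function name `switches_request`, the `switches` resource name, and the `num_switches` key are hypothetical; the sketch also assumes a job that exactly fills a switch should still request one, which is a choice still open for discussion.

```python
from typing import Optional


def switches_request(num_nodes: int,
                     nodes_per_switch: Optional[int]) -> Optional[dict]:
    """Decide whether a job should ask for a single switch.

    Returns an extra_resources-style dict when the job fits on one leaf
    switch, or None when no switch request should be made.
    """
    # Site did not declare a switch size: do not guess, request nothing.
    if nodes_per_switch is None or nodes_per_switch < 1:
        return None
    # Job cannot fit on one switch: asking anyway would only make it sit
    # in the queue until max_switch_wait expires.
    if num_nodes > nodes_per_switch:
        return None
    # Hypothetical resource and placeholder names.
    return {'switches': {'num_switches': 1}}


print(switches_request(4, 16))
print(switches_request(200, 16))
```

With 16 nodes per switch, the 4-node job gets a single-switch request while the 200-node job gets none, so it is never held back waiting for a placement that cannot exist.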
