Community calls

This page holds (temporarily) the agenda and minutes of the bi-weekly community conference calls.

April 5, 2022

Participants

Agenda

Development updates

The OSU microbenchmarks library tests are merged
Almost done with the extended syntax of valid_systems and valid_prog_environs (https://github.com/eth-cscs/reframe/pull/2479)
- We had to reimplement how valid systems/environments are selected in order to make it work with fixtures
- The implementation also fixes the bug with the --skip-{system|prgenv}-check options when using fixtures.
Still WIP: Distributing a set of tests over multiple nodes (https://github.com/eth-cscs/reframe/pull/2458)

March 22, 2022

Participants

Vasileios Karakasis (CSCS)
Victor Holanda (CSCS)
Theofilos Manitaras (CSCS)
Simon Branford (Univ. of Birmingham)

Agenda

We will delay 3.11.0 by two weeks (work stalled due to limited availability of the team), but a release candidate will be published today.
Draft PRs
- Syntax extensions for valid_systems and valid_prog_environs: https://github.com/eth-cscs/reframe/pull/2479
- OSU microbenchmarks library test (https://github.com/eth-cscs/reframe/pull/2421)
  - Still requires a bit of fine-tuning, but it will soon be ready to merge.
- Generating node-pinned tests (https://github.com/eth-cscs/reframe/pull/2458)
  - We needed to address some limitations on how we can dynamically generate tests
  - https://github.com/eth-cscs/reframe/pull/2470
  - https://github.com/eth-cscs/reframe/pull/2474

February 22, 2022

Attendees

Vasileios Karakasis (CSCS)
Theofilos Manitaras (CSCS)
Eirini Koutsaniti (CSCS)
Jg Piccinali (CSCS)
Kenneth Hoste (HPC-UGent)
Åke Sandgren (Umeå Univ.)
Rafael Sarmiento (CSCS)
Carlos Rosales (Amazon)
Richard Henwood (Arm)
Simon Branford (Univ. of Birmingham)

Agenda

We will skip 3.10.2 and target 3.11.0 for March 22; two dev releases in-between.
- Bug fixes
  - Fixed weird behaviour when overriding hooks within the same test (https://github.com/eth-cscs/reframe/pull/2436)
  - Fixed sub-configuration selection when running tests (https://github.com/eth-cscs/reframe/pull/2438)
  - Do not set up Spack shell support (https://github.com/eth-cscs/reframe/pull/2424)
- Enhancements
  - Control which attributes, variables or parameters can be logged (https://github.com/eth-cscs/reframe/pull/2428); the current behaviour can cause problems with Logstash and lead to lost records.
  - Remove pipeline timings from output.
- OSU library test and the associated CSCS tests PR (under review): https://github.com/eth-cscs/reframe/pull/2421
- Next sprint: https://github.com/eth-cscs/reframe/milestone/76
Community feedback
- Extension of the valid_systems and valid_prog_environs syntax is still work in progress. What if we supported basic compiler abstractions as in Spack here?
  - Vasileios: There are no plans for compiler auto-detection and auto-generation of the environments configuration section.
  - Kenneth: this could quickly become a time-consuming task, since compiler versions, etc. are also relevant
  - Kenneth: this seems like an opportunity for a common Python library that could be leveraged by ReFrame, Spack, EasyBuild, ...
    - kind of similar to archspec (cf. the -mtune & co. options that archspec knows about, although compiler flags for OpenMP are out of scope there)
  - Richard: Delegate the compilation task fully onto Spack and use the compiler info to generate the ReFrame config on-the-fly. Then ReFrame tests are monkey-patched to parametrise them over the various specs.
- Use cases of running a test session continuously until a time limit is reached: https://github.com/eth-cscs/reframe/issues/619
  - could be used for burn-in testing, simulate user workload, ...
  - also related to exploring range of combinations for multi-node tests, since often not enough tests are generated to actually fill a system
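The "run until a time limit is reached" use case can be sketched in a few lines of Python. This is only an illustration of the idea, not ReFrame functionality: `run_until` and its `run_session` callback are hypothetical names, and the callback stands for whatever launches one complete test session.

```python
import time
from typing import Callable


def run_until(time_limit_s: float,
              run_session: Callable[[], None],
              clock: Callable[[], float] = time.monotonic) -> int:
    """Repeat run_session() until time_limit_s seconds have elapsed.

    Returns the number of completed sessions. A session that has already
    started is never interrupted, so the total run time may exceed the limit.
    """
    deadline = clock() + time_limit_s
    completed = 0
    while clock() < deadline:
        run_session()  # hypothetical: one full test session
        completed += 1
    return completed
```

For burn-in testing, `run_session` could resubmit the same set of tests each time; the injectable `clock` is there only to make the loop easy to test.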
Meeting frequency
AOB

February 8, 2022

Attendees

Vasileios Karakasis (CSCS)
Victor Holanda (CSCS)
Theofilos Manitaras (CSCS)
Jg Piccinali (CSCS)
Stefan Wolfsheimer (SURF)
Kenneth Hoste (HPC-UGent)
Åke Sandgren (Umeå Univ.)
Ben Fulton (Indiana Univ.)
Caspar van Leeuwen (SURF)
Rafael Sarmiento (CSCS)
Carlos Rosales (Amazon)

Agenda

Development updates
- ReFrame 3.10.0 is out: https://github.com/eth-cscs/reframe/releases/tag/v3.10.0
- ReFrame 3.10.1 planned for today: https://github.com/eth-cscs/reframe/milestone/74?closed=1
- Next sprint: https://github.com/eth-cscs/reframe/milestone/75
- Added new labels to tag each issue with the framework part it refers to
- We plan to migrate the repo under github.com/reframe-hpc.
Community feedback on use cases
- Do you use or plan to use ReFrame to test and deploy a software stack, e.g., using Spack/EasyBuild?
  - Feedback: This is an interesting feature for both Spack and EasyBuild for exploring different build configurations, but it's not likely to be used for deploying the software stack.
- Towards relaxing valid_systems and valid_prog_environs: https://github.com/eth-cscs/reframe/issues/1987
  - The key challenge here is to also integrate the resources that can be defined in the configuration, which are currently accessed through extra_resources inside the test.
  - There are three types of system-related attributes: features, key/value properties and scheduler resources.
- Submit a single-node job automatically on every node of a ReFrame partition: https://github.com/eth-cscs/reframe/issues/2334
  - would be very useful to find "bad nodes" in a given reservation
  - automatically submit a separate copy of a test to each node
  - for now, nothing combinatorial (explodes quickly after 2 nodes...)
  - combinatorial selections could pick N out of M possibilities at random, or stride through a set of 100 nodes (1-10, 11-20, etc.)
    - selection mechanism is really needed when running 16-node tests out of 100 available nodes
  - Caspar: could tests somehow indicate that they want to use flexible allocation?
    - example: gpuburn to check thermal throttling of GPUs ("hardware test")
    - tests that aim to validate working software are probably less interesting to run with flexible allocation
    - idea: --flex-alloc-singlenode=idle:testXYZ,testABC => only run these 2 specific single node tests across all nodes
  - Theo: Should the tests in such a scenario share a single stage directory so as to avoid redundant builds?
  - Åke: This case should be addressed by fixtures, where the build part of the test is a fixture and you only dynamically parametrise the run test.
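The node-selection strategies discussed above (one copy of a test per node, strided groups, and N nodes picked at random out of M) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in the linked issue: the function names and the `nidNNN` node names are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence


def one_test_per_node(nodes: Sequence[str]) -> List[List[str]]:
    """One single-node job per node, e.g. to find "bad nodes"."""
    return [[n] for n in nodes]


def strided_groups(nodes: Sequence[str], group_size: int) -> List[List[str]]:
    """Consecutive groups (1-10, 11-20, ...); an incomplete tail is dropped."""
    full = len(nodes) - len(nodes) % group_size
    return [list(nodes[i:i + group_size]) for i in range(0, full, group_size)]


def random_groups(nodes: Sequence[str], group_size: int, count: int,
                  seed: Optional[int] = None) -> List[List[str]]:
    """Pick `count` groups of `group_size` distinct nodes at random."""
    rng = random.Random(seed)
    return [rng.sample(list(nodes), group_size) for _ in range(count)]


# Hypothetical node names for a 100-node reservation.
nodes = [f"nid{i:03d}" for i in range(1, 101)]

# Why nothing combinatorial for now: already 4950 distinct pairs of nodes.
pairs = math.comb(len(nodes), 2)
```

Strided groups cover each node at most once, whereas random groups may overlap; which one fits depends on whether the goal is coverage of the reservation or sampling of node combinations.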
Maintenance of scheduler backends
AOB

January 11, 2022

Agenda

Welcome and introductions
- Briefly introduce yourself and say where you are using (or planning to use) ReFrame.
Development status
- Team & contributions
  - Core team (@ekouts, @rsarm, @teojgo, @vkarak, @victorusu)
  - Contributions are more than welcome!
- Development model
  - Release train model: A new release every two weeks; releases are not delayed; whatever is ready and merged gets released
  - Semantic versioning: <major>.<minor>.<patch>
    - Patch-level bumps (every two weeks): bug fixes and new features (no deprecations)
    - Minor version bumps (every 6–8 weeks): introduction of major features (deprecations are allowed, but backward compatibility is ensured)
    - Major version bumps: backward compatibility may be broken.
- Upcoming major features scheduled for 3.10.
  - Asynchronous builds (https://github.com/eth-cscs/reframe/pull/2194)
  - New test naming scheme (https://github.com/eth-cscs/reframe/pull/2355)
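The versioning policy described above can be illustrated with a small helper that classifies the bump between two `<major>.<minor>.<patch>` versions. It is a sketch for illustration only: `parse_version`, `bump_kind` and the `POLICY` table are hypothetical names, and pre-release suffixes are not handled.

```python
from typing import Tuple

# Summary of the policy as stated in these minutes.
POLICY = {
    "patch": "bug fixes and new features, no deprecations",
    "minor": "major features; deprecations allowed, backward compatible",
    "major": "backward compatibility may be broken",
}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain '<major>.<minor>.<patch>' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Return 'major', 'minor' or 'patch' for the step from old to new."""
    o, n = parse_version(old), parse_version(new)
    if n <= o:
        raise ValueError(f"{new} is not newer than {old}")
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"
```

For example, `POLICY[bump_kind("3.10.0", "3.10.1")]` looks up what a patch-level release is allowed to contain.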
Outlook for HPC Test library
- Proof-of-concept in hpctestlib/ (documentation: https://reframe-hpc.readthedocs.io/en/stable/hpctestlib.html)
- Continue with creating library tests from our microbenchmarks
- Still unclear: community contributions, library location (different repo?), moving to stable
Discuss issues that need resolution (feature requests, bugs)
Discuss interesting use cases
