Merging main into osu with latest commits.
Merge branch 'main' into osu
Satish Kamath committed Nov 8, 2023
2 parents b83afa4 + 276b435 commit 3a32f0e
Showing 20 changed files with 371 additions and 487 deletions.
28 changes: 24 additions & 4 deletions .github/workflows/pip_install.yml
@@ -4,20 +4,33 @@ on: [push, pull_request, workflow_dispatch]
permissions: read-all
jobs:
test_pip_install:
runs-on: ubuntu-22.04
# ubuntu <= 20.04 is required for python 3.6
runs-on: ubuntu-20.04
strategy:
fail-fast: false
matrix:
python: ['3.6', '3.7', '3.8', '3.9', '3.10', '3.11']
python-version: ['3.6', '3.7', '3.8', '3.9', '3.10', '3.11']
steps:
- name: Check out software-layer repository
uses: actions/checkout@93ea575cb5d8a053eaa0ac8fa3b40d7e05a33cc8 # v3.1.0
with:
persist-credentials: false

- name: Set up Python
uses: actions/setup-python@61a6322f88396a6271a6ee3565807d608ecaddd1 # v4.7.0
with:
python-version: ${{ matrix.python-version }}

- name: Install setuptools
run: |
if [[ "${{ matrix.python-version }}" == "3.6" ]]; then
# system installed setuptools version in RHEL8 and CO7
python -m pip install --user setuptools==39.2.0
fi
- name: Install ReFrame
run: |
pip install --user ReFrame-HPC
python -m pip install --user ReFrame-HPC
- name: Install EESSI test suite with 'pip install'
run: |
@@ -26,8 +39,15 @@ jobs:
python setup.py sdist
ls dist
pip install --user dist/eessi*.tar.gz
python -m pip install --user dist/eessi*.tar.gz
find $HOME/.local
# make sure we are not in the source directory
cd $HOME
python --version
python -m pip --version
python -c 'import setuptools; print("setuptools", setuptools.__version__)'
python -c 'import eessi.testsuite.utils'
python -c 'import eessi.testsuite.tests.apps'
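
The last commands of this step amount to an import smoke test that is run from outside the source directory. A minimal Python sketch of the same idea is given here. The helper name `check_imports` is hypothetical and not part of the workflow or the test suite, and the demo uses standard-library module names as stand-ins so that it runs anywhere.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as err:  # record any import-time failure, not only ImportError
            failures[name] = f"{type(err).__name__}: {err}"
    return failures


if __name__ == "__main__":
    # Stand-in modules from the standard library (an assumption for the demo).
    failed = check_imports(["json", "os.path"])
    print("failed imports:", failed)
```

To mirror the workflow, one would pass `["eessi.testsuite.utils", "eessi.testsuite.tests.apps"]` and run the script from a directory other than the checkout, so that the installed package is picked up rather than the source tree.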
127 changes: 16 additions & 111 deletions README.md
@@ -1,108 +1,13 @@
# test-suite
A portable test suite for software installations, using ReFrame

## Getting started
A portable test suite for software installations, using ReFrame.

- install ReFrame >=4.0
## Documentation

- install the test suite using
For documentation on installing, configuring, and using the EESSI test suite, see https://eessi.io/docs/test-suite/.

```bash
pip install git+https://github.com/EESSI/test-suite.git
```

Alternatively, you can clone the repository

```bash
git clone git@github.com:EESSI/test-suite.git
```

- add the path of the `test-suite` directory to your ``$PYTHONPATH``

- create a site configuration file

- should look similar to `test-suite/config/settings_example.py`

- run the tests

The example below runs a gromacs simulation using GROMACS modules available
in the system, in combination with all available system:partitions as
defined in the site config file, using 1 full node (`--tag 1_node`, see `SCALES`
in `constants.py`). This example assumes that you have cloned the
repository at `/path/to/EESSI/test-suite`.

```
cd /path/to/EESSI/test-suite
module load ReFrame/4.2.0
export PYTHONPATH=$PWD:$PYTHONPATH
reframe \
--config-file <path_to_site_config_file> \
--checkpath eessi/testsuite/tests/apps \
--tag CI --tag 1_node \
--run --performance-report
```
## Development

## Configuring GPU/non-GPU partitions in your site config file:

- running GPU jobs in GPU nodes
- add `'features': [FEATURES[GPU]]` to the GPU partitions
- add `'extras': {GPU_VENDOR: GPU_VENDORS[NVIDIA]}` to the GPU partitions (or
`INTEL` or `AMD`, see `GPU_VENDORS` in `constants.py`)

- running non-GPU jobs in non-GPU nodes
- add `'features': [FEATURES[CPU]]` to the non-GPU partitions

- running both GPU jobs and non-GPU jobs in GPU nodes
- add `'features': [FEATURES[CPU], FEATURES[GPU]]` to the GPU partitions

- setting the number of GPUS per node <x> for a partition:
```
'access': ['-p <partition_name>'],
'devices': [
{'type': DEVICE_TYPES[GPU], 'num_devices': <x>}
],
```
- requesting GPUs per node for a partition:
```
'resources': [
{
'name': '_rfm_gpu',
'options': ['--gpus-per-node={num_gpus_per_node}'],
}
],
```
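
The partition settings listed above can be combined into a single partition entry. The following is only a sketch: the constants are stand-ins with placeholder values so that the snippet runs without the test suite installed (the real definitions are in `constants.py` and may differ), string keys are assumed as in `FEATURES['CPU']` in `config/aws_citc.py`, and the partition name and the GPU count of 4 are hypothetical.

```python
# Stand-ins for the constants defined in eessi.testsuite.constants.
# The values are placeholders chosen for this sketch only.
FEATURES = {'CPU': 'cpu', 'GPU': 'gpu'}
GPU_VENDOR = 'gpu_vendor'
GPU_VENDORS = {'NVIDIA': 'nvidia'}
DEVICE_TYPES = {'GPU': 'gpu'}

# A GPU partition that accepts both GPU and non-GPU jobs.
gpu_partition = {
    'name': 'gpu',  # hypothetical partition name
    'access': ['-p gpu'],
    'features': [FEATURES['CPU'], FEATURES['GPU']],
    'extras': {GPU_VENDOR: GPU_VENDORS['NVIDIA']},
    # Number of GPUs per node available in this partition.
    'devices': [
        {'type': DEVICE_TYPES['GPU'], 'num_devices': 4},
    ],
    # How GPUs per node are requested from the scheduler.
    'resources': [
        {
            'name': '_rfm_gpu',
            'options': ['--gpus-per-node={num_gpus_per_node}'],
        },
    ],
}

if __name__ == "__main__":
    option = gpu_partition['resources'][0]['options'][0]
    # Show the scheduler option with the placeholder filled in.
    print(option.format(num_gpus_per_node=4))
```

In a real site configuration the constants would be imported from `eessi.testsuite.constants` instead of being redefined.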
## Changing the default test behavior on the cmd line
- specifying modules
- `--setvar modules=<modulename>`
- specifying valid systems:partitions
- `--setvar valid_systems=<comma-separated-list>`
Note that setting `valid_systems` on the cmd line disables filtering of
valid systems:partitions in the hooks, so you have to do the filtering
yourself.
- overriding tasks, cpus, gpus
- `--setvar num_tasks_per_node=<x>`
- `--setvar num_cpus_per_task=<y>`
- `--setvar num_gpus_per_node=<x>`
- setting additional environment variables
- `--setvar env_vars=<envar>:<value>`
Note that these override the variables for _all_ tests in the test suite that
respect those variables. To override a variable only for specific tests, one
can use the `TEST.VAR` syntax. For example, to run the `GROMACS_EESSI` test with the
module `GROMACS/2021.6-foss-2022a`:
- `--setvar GROMACS_EESSI.modules=GROMACS/2021.6-foss-2022a`
## Developers
If you want to install the EESSI test suite from a branch, you can either
install the feature branch with `pip`, or clone the GitHub repository and check
out the feature branch.
@@ -123,8 +28,9 @@ pip install git+https://github.com/<someuser>/test-suite.git@branchname
```

### Check out a feature branch from a fork
We'll assume you already have a local clone of the official test-suite
repository, called 'origin'. In that case, executing `git remote -v`, you

We'll assume you already have a local clone of the official `test-suite`
repository, called '`origin`'. In that case, executing `git remote -v`, you
should see:

```bash
@@ -146,10 +52,10 @@ With `git remote -v` you should now see the new remote:

```bash
$ git remote -v
origin git@github.com:EESSI/test-suite.git (fetch)
origin git@github.com:EESSI/test-suite.git (push)
casparvl git@github.com:casparvl/test-suite.git (fetch)
casparvl git@github.com:casparvl/test-suite.git (push)
origin git@github.com:EESSI/test-suite.git (fetch)
origin git@github.com:EESSI/test-suite.git (push)
casparvl git@github.com:casparvl/test-suite.git (fetch)
casparvl git@github.com:casparvl/test-suite.git (push)
```

Next, we'll fetch the branches that `casparvl` has in his fork:
@@ -161,10 +67,8 @@ $ git fetch casparvl
We can check the remote branches using
```bash
$ git branch --list --remotes
casparvl/gromacs_cscs
casparvl/example_branch
casparvl/main
casparvl/setuppy
casparvl/updated_defaults_pr11
origin/HEAD -> origin/main
origin/main
```
@@ -173,14 +77,15 @@ $ git branch --list --remotes
this command).

Finally, we can create a new local branch (`-c`) and checkout one of these
feature branches (e.g. `setuppy` from the remote `casparvl`). Here, we've
picked `local_setuppy_branch` as the local branch name:
feature branches (e.g. `example_branch` from the remote `casparvl`). Here, we've
picked `my_own_example_branch` as the local branch name:
```bash
$ git switch -c local_setuppy_branch casparvl/setuppy
$ git switch -c my_own_example_branch casparvl/example_branch
```

While the initial setup is a bit more involved, the advantage of this approach
is that it is easy to pull in updates from a feature branch using `git pull`.

You can also push back changes to the feature branch directly, but note that
you are pushing to the GitHub fork of another GitHub user, so _make sure they
are ok with that_ before doing so!
19 changes: 19 additions & 0 deletions RELEASE_NOTES
@@ -0,0 +1,19 @@
This file contains a description of the major changes to the EESSI test suite.
For more detailed information, please see the git log.

v0.1.0 (5 October 2023)
-----------------------

This is the first release of the EESSI test suite.

It includes:

* A well-structured `eessi.testsuite` Python package that provides constants, utilities, hooks, and tests, which can be installed with "`pip install`".
* Tests for GROMACS and TensorFlow in `eessi.testsuite.tests.apps` that leverage the functionality provided by `eessi.testsuite.*`.
* Examples of ReFrame configuration files for various systems in the `config` subdirectory.
* A `common_logging_config()` function to facilitate the ReFrame logging configuration.
* A set of standard device types and features that can be used in the partitions section of the ReFrame configuration file.
* A set of tags (CI + scale) that can be used to filter checks.
* Scripts that show how to run the test suite.

For documentation, see https://eessi.io/docs/test-suite .
92 changes: 21 additions & 71 deletions config/aws_citc.py
@@ -1,29 +1,23 @@
# This is an example configuration file
# WARNING: for CPU autodetect to work correctly you need to
# 1. Either use ReFrame >= 4.3.3 or temporarily change the 'launcher' for each partition to srun
# 2. Either use ReFrame >= 4.3.3 or run from a clone of the ReFrame repository

# Note that CPU autodetect currently does not work with this configuration file on AWS.
# This is because there is no system mpirun, and the CPU autodetection doesn't load any modules
# that would make an mpirun command available (as normal multiprocessing tests would).
# In order to do CPU autodetection, you'll need to change the launcher to srun:
# 'launcher = srun'
# You can run the CPU autodetect by listing all tests (reframe -l ...)
# and then, once all CPUs are autodetected, change the launcher back to mpirun for a 'real' run (reframe -r ...)
# Without this, the autodetect job fails because
# 1. A missing mpirun command
# 2. An incorrect directory structure is assumed when preparing the stagedir for the autodetect job

# Another known issue is that CPU autodetection fails if run from an actual installation of ReFrame.
# It only works if run from a clone of their Github Repo. See https://github.com/reframe-hpc/reframe/issues/2914
# Related issues
# 1. https://github.com/reframe-hpc/reframe/issues/2926
# 2. https://github.com/reframe-hpc/reframe/issues/2914

from os import environ, makedirs
import os

from eessi.testsuite.common_config import common_logging_config
from eessi.testsuite.constants import FEATURES

# Get username of current user
homedir = environ.get('HOME')

# This config will write all staging, output and logging to subdirs under this prefix
reframe_prefix = f'{homedir}/reframe_runs'
log_prefix = f'{reframe_prefix}/logs'

# ReFrame complains if the directory for the file logger doesn't exist yet
makedirs(f'{log_prefix}', exist_ok=True)
# Override with RFM_PREFIX environment variable
reframe_prefix = os.path.join(os.environ['HOME'], 'reframe_runs')

# AWS CITC site configuration
site_configuration = {
@@ -32,7 +26,7 @@
'name': 'citc',
'descr': 'Cluster in the Cloud build and test environment on AWS',
'modules_system': 'lmod',
'hostnames': ['mgmt', 'login', 'fair-mastodon*'],
'hostnames': ['mgmt', 'login', 'fair-mastodon*'],
'prefix': reframe_prefix,
'partitions': [
{
@@ -110,58 +104,22 @@
'access': ['--constraint=shape=c7g.4xlarge', '--export=NONE'],
'descr': 'Graviton3, 16 cores, 32 GiB',
},
]
},
],
]
},
],
'environments': [
{
'name': 'default',
'cc': 'cc',
'cxx': '',
'ftn': '',
},
],
'logging': [
{
'level': 'debug',
'handlers': [
{
'type': 'stream',
'name': 'stdout',
'level': 'info',
'format': '%(message)s'
},
{
'type': 'file',
'prefix': f'{log_prefix}/reframe.log',
'name': 'reframe.log',
'level': 'debug',
'format': '[%(asctime)s] %(levelname)s: %(check_info)s: %(message)s', # noqa: E501
'append': True,
'timestamp': "%Y%m%d_%H%M%S",
},
],
'handlers_perflog': [
{
'type': 'filelog',
'prefix': f'{log_prefix}/%(check_system)s/%(check_partition)s',
'level': 'info',
'format': (
'%(check_job_completion_time)s|reframe %(version)s|'
'%(check_info)s|jobid=%(check_jobid)s|'
'%(check_perf_var)s=%(check_perf_value)s|'
'ref=%(check_perf_ref)s '
'(l=%(check_perf_lower_thres)s, '
'u=%(check_perf_upper_thres)s)|'
'%(check_perf_unit)s'
),
'append': True
}
]
}
],
'logging': common_logging_config(reframe_prefix),
'general': [
{
# Enable automatic detection of CPU architecture for each partition
# See https://reframe-hpc.readthedocs.io/en/stable/configure.html#auto-detecting-processor-information
'remote_detect': True,
}
],
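
The hunk above replaces the inline `logging` block with a call to `common_logging_config(reframe_prefix)`. The body of that function is not part of this excerpt. As a rough sketch only, a helper that returns the same structure as the inline block could look as follows; the name `sketch_logging_config` is hypothetical and the actual function in `eessi.testsuite.common_config` may differ.

```python
import os
import tempfile


def sketch_logging_config(prefix):
    """Return a ReFrame-style 'logging' section that writes under <prefix>/logs."""
    log_prefix = os.path.join(prefix, 'logs')
    # The inline config created this directory up front, because ReFrame
    # complains if the directory for the file logger does not exist yet.
    os.makedirs(log_prefix, exist_ok=True)
    return [{
        'level': 'debug',
        'handlers': [
            {'type': 'stream', 'name': 'stdout', 'level': 'info', 'format': '%(message)s'},
            {
                'type': 'file',
                'prefix': f'{log_prefix}/reframe.log',
                'name': 'reframe.log',
                'level': 'debug',
                'format': '[%(asctime)s] %(levelname)s: %(check_info)s: %(message)s',
                'append': True,
                'timestamp': '%Y%m%d_%H%M%S',
            },
        ],
        'handlers_perflog': [{
            'type': 'filelog',
            'prefix': f'{log_prefix}/%(check_system)s/%(check_partition)s',
            'level': 'info',
            'format': (
                '%(check_job_completion_time)s|reframe %(version)s|'
                '%(check_info)s|jobid=%(check_jobid)s|'
                '%(check_perf_var)s=%(check_perf_value)s|'
                'ref=%(check_perf_ref)s '
                '(l=%(check_perf_lower_thres)s, '
                'u=%(check_perf_upper_thres)s)|'
                '%(check_perf_unit)s'
            ),
            'append': True,
        }],
    }]


if __name__ == "__main__":
    # Use a temporary directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        config = sketch_logging_config(tmp)
        print(config[0]['handlers'][1]['prefix'])
```

Moving the block into one function means each site configuration file only needs to pass its own prefix, which is what the new `'logging': common_logging_config(reframe_prefix)` line does.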
@@ -170,13 +128,7 @@
# Add default things to each partition:
partition_defaults = {
'scheduler': 'squeue',
# mpirun causes problems with cpu autodetect, since there is no system mpirun.
# See https://github.com/EESSI/test-suite/pull/53#issuecomment-1590849226
# and this feature request https://github.com/reframe-hpc/reframe/issues/2926
# However, using srun requires either using pmix or proper pmi2 integration in the MPI library
# See https://github.com/EESSI/test-suite/pull/53#issuecomment-1598753968
# Thus, we use mpirun for now, and manually swap to srun if we want to autodetect CPUs...
'launcher': 'srun',
'launcher': 'mpirun',
'environs': ['default'],
'features': [
FEATURES['CPU']
@@ -191,5 +143,3 @@
for system in site_configuration['systems']:
for partition in system['partitions']:
partition.update(partition_defaults)
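
One detail of the loop above is worth spelling out: `dict.update` overwrites keys that a partition already defines, so a partition-specific `'launcher'` would be replaced by the default. The sketch below shows both behaviors on toy data. The function `apply_defaults` and its `keep_existing` flag are hypothetical and not part of the configuration file.

```python
def apply_defaults(partitions, defaults, keep_existing=False):
    """Add default settings to every partition dict in place and return the list."""
    for partition in partitions:
        if keep_existing:
            # Only fill in keys that the partition does not define itself.
            for key, value in defaults.items():
                partition.setdefault(key, value)
        else:
            # Same behavior as the config file: defaults win over existing keys.
            partition.update(defaults)
    return partitions


if __name__ == "__main__":
    defaults = {'scheduler': 'squeue', 'launcher': 'mpirun', 'environs': ['default']}

    overridden = apply_defaults([{'name': 'p1', 'launcher': 'srun'}], defaults)
    print(overridden[0]['launcher'])

    preserved = apply_defaults([{'name': 'p1', 'launcher': 'srun'}], defaults, keep_existing=True)
    print(preserved[0]['launcher'])
```

With `update`, every partition ends up with the same launcher, which matches the intent of a single default. The `setdefault` variant would be the choice if individual partitions needed to keep their own settings.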

