Merge branch 'main' into osu
Satish Kamath committed Dec 13, 2023
2 parents efe9198 + abfe896 commit 378c9e7
Showing 21 changed files with 531 additions and 112 deletions.
51 changes: 51 additions & 0 deletions CI/README.md
@@ -0,0 +1,51 @@
# Setting up EESSI test suite CI

To set up regular runs for the EESSI test suite on a system, four things are needed:

1. The variable `EESSI_CI_SYSTEM_NAME` needs to be set in the environment
2. A local checkout of the `CI` subdirectory of the EESSI test suite repository needs to be present
3. The EESSI test suite repository needs to contain a file `CI/${EESSI_CI_SYSTEM_NAME}/ci_config.sh` with the configuration for the CI on that system
4. An entry in your `crontab` needs to run `run_reframe_wrapper.sh`

## Checking out the CI folder from the EESSI test-suite
You can clone the full EESSI test suite:
```
git clone https://github.com/EESSI/test-suite.git
```
Or do a sparse checkout of only the `CI` directory:
```
git clone -n --depth=1 --filter=tree:0 https://github.com/EESSI/test-suite.git
cd test-suite
git sparse-checkout set --no-cone CI
git checkout
```

## Creating a CI configuration file
If you are adding CI on a new system, first, pick a name for that system (we'll refer to this as `EESSI_CI_SYSTEM_NAME`). The CI config should then be in `CI/${EESSI_CI_SYSTEM_NAME}/ci_config.sh`. You can use the example in `CI/aws_mc/ci_config.sh`, and adapt it to your needs.
It should define:
- `TEMPDIR` (optional): the temporary directory in which the CI pipeline can check out repositories and install ReFrame. Default: `$(mktemp --directory --tmpdir=/tmp -t rfm.XXXXXXXXXX)`.
- `REFRAME_ARGS` (optional): additional arguments to pass to the `reframe` command. Typically, you'll use this to specify `--tag` arguments to run a subset of tests. Default: `"--tag CI --tag 1_node"`.
- `REFRAME_VERSION` (mandatory): the version of ReFrame you'd like to use to drive the EESSI test suite in the CI pipeline.
- `REFRAME_URL` (optional): the URL that will be used to `git clone` the ReFrame repository (in order to provide the `hpctestlib`). Typically this points to the official repository, but you may want to use another URL from a fork for development purposes. Default: `https://github.com/reframe-hpc/reframe.git`.
- `REFRAME_BRANCH` (optional): the branch name to be cloned for the ReFrame repository (in order to provide the `hpctestlib`). Typically this points to the branch corresponding with `${REFRAME_VERSION}`, unless you want to run from a feature branch for development purposes. Default: `v${REFRAME_VERSION}`.
- `EESSI_VERSION` (optional): the version of the EESSI software stack you would like to be loaded & tested in the CI pipeline. Default: `latest`.
- `EESSI_TESTSUITE_URL` (optional): the URL that will be used to `git clone` the `EESSI/test-suite` repository. Typically this points to the official repository, but you may want to use another URL from a fork for development purposes. Default: `https://github.com/EESSI/test-suite.git`.
- `EESSI_TESTSUITE_BRANCH` (optional): the branch or tag of the EESSI test-suite repository you want to use in the CI pipeline. Default: `v0.1.0`.
- `RFM_CONFIG_FILES` (optional): the location of the ReFrame configuration file to be used for this system. Default: `${TEMPDIR}/test-suite/config/${EESSI_CI_SYSTEM_NAME}.py`.
- `RFM_CHECK_SEARCH_PATH` (optional): the search path where ReFrame should search for tests to run in this CI pipeline. Default: `${TEMPDIR}/test-suite/eessi/testsuite/tests/`.
- `RFM_CHECK_SEARCH_RECURSIVE` (optional): whether ReFrame should search `RFM_CHECK_SEARCH_PATH` recursively. Default: `1`.
- `RFM_PREFIX` (optional): the prefix in which ReFrame stores all the files. Default: `${HOME}/reframe_CI_runs`.
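
As a rough sketch, a minimal configuration for a hypothetical system named `my_cluster` could look like the following. The system name and the `check_ci_config` helper are assumptions made for illustration and are not part of the test suite; only the variable names come from the list above.

```bash
# Hypothetical contents of CI/my_cluster/ci_config.sh (system name is an assumption)
REFRAME_VERSION=4.4.1                 # mandatory
REFRAME_ARGS="--tag CI --tag 1_node"  # optional, same value as the documented default

# Illustrative helper, not part of the test suite: report whether the
# mandatory variable was set by the configuration above.
check_ci_config() {
    if [ -z "${REFRAME_VERSION}" ]; then
        echo "REFRAME_VERSION is mandatory but not set" >&2
        return 1
    fi
    echo "ReFrame ${REFRAME_VERSION} with args: ${REFRAME_ARGS}"
}
```

Sourcing such a file and calling `check_ci_config` before the first scheduled run is one way to catch a missing mandatory setting early.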

## Creating the `crontab` entry and specifying `EESSI_CI_SYSTEM_NAME`
This line depends on how often you want to run the tests, and where exactly `run_reframe_wrapper.sh` is located. We also define `EESSI_CI_SYSTEM_NAME` in this entry, as cronjobs don't normally read your `.bashrc` (and thus we need a different way of specifying this environment variable).
Assuming you checked out the EESSI test suite repository in your home dir:
```
echo "0 0 * * SUN EESSI_CI_SYSTEM_NAME=aws_citc ${HOME}/test-suite/CI/run_reframe_wrapper.sh" | crontab -
```
This creates a cronjob that runs weekly, at midnight on Sundays. See the crontab manual for other schedules.
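
Note that `crontab -` installs whatever it reads from standard input as the complete crontab, so piping a single `echo` into it replaces any entries you already have. If you want to keep existing entries, one possible approach is sketched below; the helper names `make_cron_entry` and `append_cron_entry` are made up for this example.

```bash
# Build the crontab line; the helper name and its arguments are illustrative.
make_cron_entry() {
    system_name=$1
    wrapper=$2
    echo "0 0 * * SUN EESSI_CI_SYSTEM_NAME=${system_name} ${wrapper}"
}

# Print the current crontab (if any), followed by the new entry.
append_cron_entry() {
    crontab -l 2>/dev/null
    make_cron_entry "$1" "$2"
}

# Usage (commented out so that pasting this block does not modify your crontab):
# append_cron_entry aws_citc "${HOME}/test-suite/CI/run_reframe_wrapper.sh" | crontab -
```

Running `crontab -l` afterwards shows whether both the old entries and the new one are present.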

## Output of the CI pipeline
The whole point of the `run_reframe_wrapper.sh` script is to easily get the stdout and stderr from your `run_reframe.sh` in a time-stamped logfile. By default, these are stored in `${HOME}/EESSI_CI_LOGS`. This can be changed by setting the environment variable `EESSI_CI_LOGDIR`. Again, you'd have to set this when creating your `crontab` file, e.g.
```
echo "0 0 * * SUN EESSI_CI_SYSTEM_NAME=aws_citc EESSI_CI_LOGDIR=${HOME}/my_custom_logdir ${HOME}/test-suite/CI/run_reframe_wrapper.sh" | crontab -
```
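
To inspect the result of the most recent run, something along these lines may help. It assumes the `rfm_<datestamp>.log` naming used by `run_reframe_wrapper.sh`; the function name `latest_ci_log` is hypothetical.

```bash
# Print the path of the most recent CI log, or return non-zero if there is none.
# Relies on the rfm_YYYYmmdd_HHMMSS.log naming, which sorts chronologically.
latest_ci_log() {
    logdir=${1:-${EESSI_CI_LOGDIR:-${HOME}/EESSI_CI_LOGS}}
    latest=""
    for f in "${logdir}"/rfm_*.log; do
        # If nothing matches, the pattern stays literal and this test fails
        [ -e "$f" ] && latest=$f
    done
    [ -n "${latest}" ] && echo "${latest}"
}
```

For example, `tail -n 50 "$(latest_ci_log)"` would show the end of the latest log in the default log directory.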
3 changes: 3 additions & 0 deletions CI/aws_citc/ci_config.sh
@@ -0,0 +1,3 @@
# Configurable items
REFRAME_ARGS="--tag CI --tag 1_node|2_nodes"
REFRAME_VERSION=4.4.1 # ReFrame version that will be pip-installed to drive the test suite
7 changes: 7 additions & 0 deletions CI/aws_mc/ci_config.sh
@@ -0,0 +1,7 @@
# Configurable items
REFRAME_ARGS="--tag CI --tag 1_node|2_nodes"
REFRAME_VERSION=4.4.1 # ReFrame version that will be pip-installed to drive the test suite
# Latest release does not contain the `aws_mc.py` ReFrame config yet
# The custom EESSI_TESTSUITE_URL and EESSI_TESTSUITE_BRANCH can be removed in a follow-up PR
EESSI_TESTSUITE_URL='https://github.com/casparvl/test-suite.git'
EESSI_TESTSUITE_BRANCH='CI'
7 changes: 7 additions & 0 deletions CI/it4i_karolina/ci_config.sh
@@ -0,0 +1,7 @@
# Configurable items
REFRAME_ARGS="--tag CI --tag 1_node|2_nodes"
REFRAME_VERSION=4.4.1 # ReFrame version that will be pip-installed to drive the test suite
# Latest release does not contain the `aws_mc.py` ReFrame config yet
# The custom EESSI_TESTSUITE_URL and EESSI_TESTSUITE_BRANCH can be removed in a follow-up PR
EESSI_TESTSUITE_URL='https://github.com/casparvl/test-suite.git'
EESSI_TESTSUITE_BRANCH='CI'
3 changes: 3 additions & 0 deletions CI/izum_vega/ci_config.sh
@@ -0,0 +1,3 @@
# Configurable items
REFRAME_ARGS="--tag CI --tag 1_node|2_nodes"
REFRAME_VERSION=4.4.1 # ReFrame version that will be pip-installed to drive the test suite
117 changes: 117 additions & 0 deletions CI/run_reframe.sh
@@ -0,0 +1,117 @@
#!/bin/bash
# Author: Caspar van Leeuwen
# Description: This script can be used to do regular runs of the ReFrame test suite, e.g. from a cronjob.
# Setup instructions: make sure you have your github access key configured in your .ssh/config
# i.e. configure an entry with HostName github.com and IdentityFile pointing to the ssh key registered with Github

# Get directory of the current script
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

# Check if EESSI_CI_SYSTEM_NAME is defined
if [ -z "${EESSI_CI_SYSTEM_NAME}" ]; then
    echo "You have to define the EESSI_CI_SYSTEM_NAME environment variable in order to run the EESSI test suite CI" >&2
    exit 1
fi

# Check if the CI_CONFIG file exists
CI_CONFIG="${SCRIPT_DIR}/${EESSI_CI_SYSTEM_NAME}/ci_config.sh"
if [ ! -f "${CI_CONFIG}" ]; then
    echo "File ${CI_CONFIG} does not exist. Please check your EESSI_CI_SYSTEM_NAME (${EESSI_CI_SYSTEM_NAME}) and make sure the directory in which the current script resides (${SCRIPT_DIR}) contains a subdirectory with that name, and a CI configuration file (ci_config.sh) inside." >&2
    exit 1
fi

# Set the CI configuration for this system
source "${CI_CONFIG}"

# Set default configuration
if [ -z "${TEMPDIR}" ]; then
    TEMPDIR=$(mktemp --directory --tmpdir=/tmp -t rfm.XXXXXXXXXX)
fi
if [ -z "${REFRAME_ARGS}" ]; then
    REFRAME_ARGS="--tag CI --tag 1_node"
fi
if [ -z "${REFRAME_URL}" ]; then
    REFRAME_URL='https://github.com/reframe-hpc/reframe.git'
fi
if [ -z "${REFRAME_BRANCH}" ]; then
    REFRAME_BRANCH="v${REFRAME_VERSION}"
fi
if [ -z "${EESSI_TESTSUITE_URL}" ]; then
    EESSI_TESTSUITE_URL='https://github.com/EESSI/test-suite.git'
fi
if [ -z "${EESSI_TESTSUITE_BRANCH}" ]; then
    EESSI_TESTSUITE_BRANCH='v0.1.0'
fi
if [ -z "${EESSI_VERSION}" ]; then
    EESSI_VERSION='latest'
fi
if [ -z "${RFM_CONFIG_FILES}" ]; then
    export RFM_CONFIG_FILES="${TEMPDIR}/test-suite/config/${EESSI_CI_SYSTEM_NAME}.py"
fi
if [ -z "${RFM_CHECK_SEARCH_PATH}" ]; then
    export RFM_CHECK_SEARCH_PATH="${TEMPDIR}/test-suite/eessi/testsuite/tests/"
fi
if [ -z "${RFM_CHECK_SEARCH_RECURSIVE}" ]; then
    export RFM_CHECK_SEARCH_RECURSIVE=1
fi
if [ -z "${RFM_PREFIX}" ]; then
    export RFM_PREFIX="${HOME}/reframe_CI_runs"
fi

# Create virtualenv for ReFrame using system python
python3 -m venv "${TEMPDIR}"/reframe_venv
source "${TEMPDIR}"/reframe_venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install reframe-hpc=="${REFRAME_VERSION}"

# Clone reframe repo to have the hpctestlib:
git clone "${REFRAME_URL}" --branch "${REFRAME_BRANCH}" "${TEMPDIR}"/reframe
export PYTHONPATH="${PYTHONPATH}":"${TEMPDIR}"/reframe

# Clone test suite repo
git clone "${EESSI_TESTSUITE_URL}" --branch "${EESSI_TESTSUITE_BRANCH}" "${TEMPDIR}"/test-suite
export PYTHONPATH="${PYTHONPATH}":"${TEMPDIR}"/test-suite/

# Start the EESSI environment
unset MODULEPATH
if [ "${EESSI_VERSION}" = 'latest' ]; then
    eessi_init_path=/cvmfs/pilot.eessi-hpc.org/latest/init/bash
else
    eessi_init_path=/cvmfs/pilot.eessi-hpc.org/versions/"${EESSI_VERSION}"/init/bash
fi
source "${eessi_init_path}"

# Needed in order to make sure the reframe from our TEMPDIR is first on the PATH,
# prior to the one shipped with the 2021.12 compat layer
# Probably no longer needed with newer compat layer that doesn't include ReFrame
deactivate
source "${TEMPDIR}"/reframe_venv/bin/activate

# Print ReFrame config
echo "Starting CI run with the following settings:"
echo ""
echo "TEMPDIR: ${TEMPDIR}"
echo "PYTHONPATH: ${PYTHONPATH}"
echo "EESSI test suite URL: ${EESSI_TESTSUITE_URL}"
echo "EESSI test suite branch: ${EESSI_TESTSUITE_BRANCH}"
echo "HPCtestlib from ReFrame URL: ${REFRAME_URL}"
echo "HPCtestlib from ReFrame branch: ${REFRAME_BRANCH}"
echo "ReFrame executable: $(which reframe)"
echo "ReFrame version: $(reframe --version)"
echo "ReFrame config file: ${RFM_CONFIG_FILES}"
echo "ReFrame check search path: ${RFM_CHECK_SEARCH_PATH}"
echo "ReFrame check search recursive: ${RFM_CHECK_SEARCH_RECURSIVE}"
echo "ReFrame prefix: ${RFM_PREFIX}"
echo "ReFrame args: ${REFRAME_ARGS}"
echo ""

# List tests
echo "Listing tests:"
reframe ${REFRAME_ARGS} --list

# Run
echo "Run tests:"
reframe ${REFRAME_ARGS} --run

# Cleanup
rm -rf "${TEMPDIR}"
22 changes: 22 additions & 0 deletions CI/run_reframe_wrapper.sh
@@ -0,0 +1,22 @@
#!/bin/bash
# Author: Caspar van Leeuwen
# Description: wraps the run_reframe.sh script so that all stdout and stderr can easily be collected in a logfile
# which has a datestamp in the name.

# logfile
if [ -n "${EESSI_CI_LOGDIR}" ]; then
    LOGDIR="${EESSI_CI_LOGDIR}"
else
    LOGDIR="${HOME}/EESSI_CI_LOGS"
fi
mkdir -p "${LOGDIR}"

datestamp=$(date +%Y%m%d_%H%M%S)
LOGFILE="${LOGDIR}/rfm_${datestamp}.log"
touch "${LOGFILE}"

# Get directory of the current script
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

# Execute run_reframe.sh, which should be in the same directory as the current script
"${SCRIPT_DIR}"/run_reframe.sh > "${LOGFILE}" 2>&1
3 changes: 3 additions & 0 deletions CI/surf_snellius/ci_config.sh
@@ -0,0 +1,3 @@
# Configurable items
REFRAME_ARGS="--tag CI --tag 1_node|2_nodes"
REFRAME_VERSION=4.4.1 # ReFrame version that will be pip-installed to drive the test suite
4 changes: 2 additions & 2 deletions config/aws_citc.py
@@ -12,7 +12,7 @@

import os

-from eessi.testsuite.common_config import common_logging_config
+from eessi.testsuite.common_config import common_logging_config, common_eessi_init
from eessi.testsuite.constants import FEATURES

# This config will write all staging, output and logging to subdirs under this prefix
@@ -224,7 +224,7 @@
         FEATURES['CPU']
     ],
     'prepare_cmds': [
-        'source /cvmfs/pilot.eessi-hpc.org/latest/init/bash',
+        'source %s' % common_eessi_init(),
         # Required when using srun as launcher with --export=NONE in partition access, in order to ensure job
         # steps inherit environment. It doesn't hurt to define this even if srun is not used
         'export SLURM_EXPORT_ENV=ALL'
110 changes: 110 additions & 0 deletions config/aws_mc.py
@@ -0,0 +1,110 @@
# WARNING: for CPU autodetect to work correctly you need to
# 1. Either use ReFrame >= 4.3.3 or temporarily change the 'launcher' for each partition to srun
# 2. Either use ReFrame >= 4.3.3 or run from a clone of the ReFrame repository

# Without this, the autodetect job fails because
# 1. A missing mpirun command
# 2. An incorrect directory structure is assumed when preparing the stagedir for the autodetect job

# Related issues
# 1. https://github.com/reframe-hpc/reframe/issues/2926
# 2. https://github.com/reframe-hpc/reframe/issues/2914

import os

from eessi.testsuite.common_config import common_logging_config, common_eessi_init
from eessi.testsuite.constants import FEATURES

# This config will write all staging, output and logging to subdirs under this prefix
# Override with RFM_PREFIX environment variable
reframe_prefix = os.path.join(os.environ['HOME'], 'reframe_runs')

# AWS Magic Castle site configuration
site_configuration = {
    'systems': [
        {
            'name': 'Magic_Castle',
            'descr': 'Magic Castle build and test environment on AWS',
            'modules_system': 'lmod',
            'hostnames': ['login*', '*-node'],
            'prefix': reframe_prefix,
            'partitions': [
                {
                    'name': 'x86_64-generic-16c-30gb',
                    'access': ['--partition=x86-64-generic-node', '--export=NONE'],
                    'descr': 'Generic (Haswell), 16 cores, 30 GB',
                },
                {
                    'name': 'x86_64-haswell-16c-30gb',
                    'access': ['--partition=x86-64-intel-haswell-node', '--export=NONE'],
                    'descr': 'Haswell, 16 cores, 30 GB',
                },
                {
                    'name': 'x86_64-skylake-16c-30gb',
                    'access': ['--partition=x86-64-intel-skylake-node', '--export=NONE'],
                    'descr': 'Skylake, 16 cores, 30 GB',
                },
                {
                    'name': 'x86_64-zen2-16c-30gb',
                    'access': ['--partition=x86-64-amd-zen2-node', '--export=NONE'],
                    'descr': 'Zen2, 16 cores, 30 GB',
                },
                {
                    'name': 'x86_64-zen3-16c-30gb',
                    'access': ['--partition=x86-64-amd-zen3-node', '--export=NONE'],
                    'descr': 'Zen3, 16 cores, 30 GiB',
                },
                {
                    'name': 'aarch64-generic-16c-32gb',
                    'access': ['--partition=aarch64-generic-node', '--export=NONE'],
                    'descr': 'Generic (Neoverse N1), 16 cores, 32 GB',
                },
                {
                    'name': 'aarch64-neoverse-V1-16c-32gb',
                    'access': ['--partition=aarch64-neoverse-v1-node', '--export=NONE'],
                    'descr': 'Neoverse V1, 16 cores, 32 GB',
                },
                {
                    'name': 'aarch64-neoverse-N1-16c-32gb',
                    'access': ['--partition=aarch64-neoverse-n1-node', '--export=NONE'],
                    'descr': 'Neoverse N1, 16 cores, 32 GiB',
                },
            ]
        },
    ],
    'environments': [
        {
            'name': 'default',
            'cc': 'cc',
            'cxx': '',
            'ftn': '',
        },
    ],
    'logging': common_logging_config(reframe_prefix),
    'general': [
        {
            # Enable automatic detection of CPU architecture for each partition
            # See https://reframe-hpc.readthedocs.io/en/stable/configure.html#auto-detecting-processor-information
            'remote_detect': True,
        }
    ],
}

# Add default things to each partition:
partition_defaults = {
    'scheduler': 'slurm',
    'launcher': 'mpirun',
    'environs': ['default'],
    'features': [
        FEATURES['CPU']
    ],
    'prepare_cmds': [
        'source %s' % common_eessi_init(),
        # Required when using srun as launcher with --export=NONE in partition access, in order to ensure job
        # steps inherit environment. It doesn't hurt to define this even if srun is not used
        'export SLURM_EXPORT_ENV=ALL'
    ],
}
for system in site_configuration['systems']:
    for partition in system['partitions']:
        partition.update(partition_defaults)