Apptainer (formerly known as Singularity) is a container runtime designed for HPC. It allows the execution of docker containers in user space. This alleviates the concern of granting admin privileges to end users on a shared system.
Singularity also comes with its own definition-file language for building singularity containers, which is reasonably similar to what docker uses.
Singularity can run either singularity containers or docker containers. The latter are converted into singularity containers on-the-fly.
The goal of this article is to
- describe the installation and configuration of singularity on an HPC cluster running SLURM as a scheduler
- configure RStudio Workbench (RSW) to use singularity containers on the same HPC cluster
- Everything related to RStudio Workbench and R runs in containers (docker and singularity)
- Look&feel of RStudio Workbench (almost) unchanged from a user perspective
- Utilisation of shared storage for singularity containers and renv cache
- Installation of Singularity
- Set up a SPANK plugin for deep integration of singularity into SLURM
- Build Singularity Containers for R Session based on the docker containers for r-session-complete
- Build Docker Container for RSW based on rstudio-workbench
- Simple tests for the new functionality
- Hints and suggestions on how to use Singularity and R for increased reproducibility
- reasonably up-to-date and fully functional version of SLURM (version 19+)
- (optional) application stack using environment modules or Lmod with base directory in appstack-path
- persistent shared storage across the nodes (e.g. general NAS, NFS, GPFS, ...) to store the singularity images. This folder will subsequently be referred to as container-path
- transient shared storage across the nodes (e.g. Lustre, GPFS, ...) for scratch storage, subsequently referred to as scratch-path
- Using a docker container for RSW is not strictly needed - RSW can also be installed and configured natively
For the installation, simply follow the instructions.
If you plan to integrate it into your application stack, make sure you choose a prefix that is compatible with your other applications in the stack and uses the same naming convention, e.g. appstack-path/singularity/3.8.5 for Singularity 3.8.5. A sample Lua module is provided for convenience.
SLURM is a popular HPC scheduler that supports SPANK plugins. SPANK stands for Slurm Plug-in Architecture for Node and job (K)control. For the work considered here, a new SPANK plugin is created that allows a deep integration of singularity into the HPC cluster.
While strictly not necessary, it will simplify the usage of singularity significantly for the end users.
Instead of using a submit script for each singularity run like
#!/bin/bash
ml load singularity/3.8.5
singularity run R-container.sif Rscript myCode.R
they can run straight
#!/path/to/Rscript
#SBATCH --singularity-container my-R-container.sif
<R Code>
i.e. they add the SBATCH line above and any other resource requirements to their R code and submit it without needing to know the details of the singularity implementation (/path/to/Rscript needs to match the path within the container).
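The direct-submission form can be sketched with a small shell helper that prepends the interpreter line and the SBATCH option to an existing R file. The function name `make_r_job` and its argument order are hypothetical and only illustrate the layout described above; they are not part of the plugin.

```shell
# Hypothetical helper (not part of the plugin): print a job file that
# consists of the interpreter line, the SBATCH option from the text and
# the unchanged R code.
make_r_job() {
    container=$1   # container image name, e.g. my-R-container.sif
    rscript=$2     # path to Rscript as seen *inside* the container
    rfile=$3       # plain R source file
    printf '#!%s\n' "$rscript"
    printf '#SBATCH --singularity-container %s\n' "$container"
    cat "$rfile"
}
```

A possible use would be `make_r_job my-R-container.sif /path/to/Rscript myCode.R > job.R`, followed by `sbatch job.R`.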
RStudio is not the first company to use a SPANK plugin for singularity integration. Many supercomputing centers around the world have implemented such a plugin.
We are therefore using an implementation from GSI that we extended further to make it even more flexible.
Further details with up-to-date information can be found in slurm-singularity-exec.
In order to install and configure the SPANK plugin for singularity specifically for our use case, please use the plugin in the subfolder slurm-singularity-exec. Before building and installing, please
- replace in singularity-exec-conf.tmpl
  - /efs/singularity/containers by container-path
  - /efs by any storage path you want to have available within the container (if not necessary, please remove /efs)
  - /scratch by scratch-path
  - the remaining options should remain unchanged. The path= variables will create the bind mounts for the container:
    - /sys for cgroups support
    - /var/run/munge, /etc/munge and /run/munge for munge support
    - /var/spool/slurmd to allow submitting jobs from within the container
- replace in slurm-singularity-wrapper.sh
  - /efs/singularity/3.8.5/bin/ with the full path to the singularity binary or the appropriate ml/module commands to load the module
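The path substitutions in the template could be scripted with sed, as in the following minimal sketch. The function name `customize_conf` is hypothetical, the exact contents of the template are not assumed, and the sketch only handles the container-path and scratch-path replacements; the generic /efs bind and the wrapper script still need to be edited by hand.

```shell
# Sketch: read a template on stdin and write the customised version to
# stdout. Intermediate placeholders prevent a later rule from touching
# text inserted by an earlier one (e.g. a container path that itself
# contains "/scratch").
# Assumption: the supplied paths contain no '|' or '&' characters.
customize_conf() {
    container_path=$1
    scratch_path=$2
    sed -e 's|/efs/singularity/containers|@CONTAINER@|g' \
        -e 's|/scratch|@SCRATCH@|g' \
        -e "s|@CONTAINER@|$container_path|g" \
        -e "s|@SCRATCH@|$scratch_path|g"
}
```

It would be called along the lines of `customize_conf <container-path> <scratch-path> < singularity-exec-conf.tmpl > singularity-exec-conf.tmpl.new`, after which the result can be reviewed before replacing the original template.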
Once done, simply run (with admin rights)
make install
Please note: Any of the above modifications can be done later on as well.
Once the plugin is installed, please restart slurmctld via
systemctl restart slurmctld
First let us build a singularity image from a docker container, e.g. from CentOS 8:
singularity build centos8.img docker://centos:8
We can now run a command inside this image via singularity
singularity run centos8.img cat /etc/centos-release
which should show us that we are indeed running in CentOS 8.
To test the SPANK plugin for singularity, we can now run
srun --pty --singularity-container-path=`pwd` --singularity-container centos8.img bash
Singularity> cat /etc/centos-release
If the above steps work, then the plugin is good to go for the next step.
- reuse as much as possible, which is why we will use containers from r-session-complete
- only add as much as needed but also enough to make the use of the containers straightforward and seamless
- add some packages and configuration specific for HPC (e.g. munge, zeromq as a pre-req for clustermq)
- add renv to avoid the chicken-and-egg problem, i.e. to have renv installed in addition to all the other Base R packages
- configure renv to use a global package cache and add OS/linux-distro specific additional level in the directory structure
- add Java integration to the installed version of R since rJava is a problematic R package
- set up binary repositories for CRAN and Bioconductor from public RSPM
- for CentOS 7 add devtoolset-10 to allow for a more recent compiler toolchain.
Appropriate singularity recipes can be found for CentOS 7 and Ubuntu 18.04 / Bionic. They have ample comments to help you decide which bits to keep and which to discard.
They can be built by running (using admin privileges)
singularity build r-session-complete.sif r-session-complete.sdef
Please note that this can be a very time-consuming process. Ensure that your temporary folder (e.g. /tmp or wherever the environment variable TMP/TMPDIR etc. points to) has sufficient disk space available. You will need around 4 GB of disk space. A benefit of singularity containers is that they are much smaller (<50 % of docker image size), but they take a while to build.
Also make sure you set the SLURM_VERSION variable to the same version that your HPC cluster is using.
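Before starting a long build it can help to check the free space in the temporary folder. The following is a minimal sketch using `df`; the helper names `free_kb` and `has_space` are hypothetical, and the 4 GiB threshold simply mirrors the figure mentioned above.

```shell
# Free space in KiB on the file system that holds the given directory
# (POSIX output format, so the value is in the 4th column of line 2).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds if at least min_gb GiB are free in dir.
has_space() {
    dir=$1
    min_gb=$2
    [ "$(free_kb "$dir")" -ge $((min_gb * 1024 * 1024)) ]
}

# Assumption: the build uses TMPDIR if set and /tmp otherwise.
build_tmp=${TMPDIR:-/tmp}
if has_space "$build_tmp" 4; then
    echo "enough free space in $build_tmp"
else
    echo "less than 4 GiB free in $build_tmp" >&2
fi
```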
It is mandatory to set the environment variable RSW_LICENSE to point to a valid license key. In addition, the docker container will be built for SLURM 19.05.2 and RStudio Workbench 2021.09.1-372.pro1 by default. Those defaults can be changed by defining the environment variables SLURM_VERSION and RSW_VERSION, respectively.
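The handling of these variables can be sketched as a small pre-build check. The function `resolve_build_env` is hypothetical; it only mirrors the defaults stated above and refuses to continue without a license key. How the build itself consumes the variables is not shown here.

```shell
# Hypothetical pre-build check: RSW_LICENSE is mandatory, the two
# version variables fall back to the defaults named in the text.
resolve_build_env() {
    if [ -z "${RSW_LICENSE:-}" ]; then
        echo "RSW_LICENSE must be set" >&2
        return 1
    fi
    SLURM_VERSION=${SLURM_VERSION:-19.05.2}
    RSW_VERSION=${RSW_VERSION:-2021.09.1-372.pro1}
    export RSW_LICENSE SLURM_VERSION RSW_VERSION
    echo "SLURM_VERSION=$SLURM_VERSION RSW_VERSION=$RSW_VERSION"
}
```

Running `resolve_build_env` in the shell that later starts the build would export the resolved values to that build.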
- Change into the directory data/workbench of this repository
- Make sure the launcher-sessions-callback-address in etc/rstudio/rserver.conf is set to a URL that is reachable from the compute nodes.
- Create a directory munge and copy your munge.key into that folder. Change ownership to user and group 111 (e.g. chown 111:111 munge/munge.key).
- Run (using admin privileges)
docker-compose build
- You also may want to
  - push the new image to your docker registry
  - configure your authentication mechanism in the docker container
  - review in docker-compose.yml the bind mounts (e.g. /efs) to ensure that essential file systems (/home, ...) are mounted into the container.
docker-compose up -d
- Browsing to http://<hostname of docker server>:8787 should now present the RSW login screen (by default it has two users, rstudio/rstudio and mm/test123).
- Once logged in, you can select between the local and SLURM launcher and run your R session.
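The munge key step from the list above can be sketched as a small helper. The function name `prepare_munge_dir` is hypothetical, and restricting the key to mode 400 is an assumption added here, not something the instructions above require.

```shell
# Sketch: copy an existing munge.key into the directory used by the
# docker build and hand it to uid/gid 111 when running as root.
prepare_munge_dir() {
    key_src=$1            # existing munge.key of the cluster
    dest_dir=${2:-munge}  # target directory, "munge" by default
    mkdir -p "$dest_dir" || return 1
    cp -f "$key_src" "$dest_dir/munge.key" || return 1
    chmod 400 "$dest_dir/munge.key"   # assumption: keep the key private
    if [ "$(id -u)" -eq 0 ]; then
        chown 111:111 "$dest_dir/munge.key" ||
            echo "chown 111:111 failed for $dest_dir/munge.key" >&2
    else
        echo "not root: run chown 111:111 $dest_dir/munge.key manually" >&2
    fi
}
```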
The singularity integration of the RSW UI is done in launcher.slurm.conf. There you will find the line
constraints=Container=singularity-container
which will activate a new element in the web UI where users can specify the respective image they want to load. The slurm launcher will then append the option --singularity-container with the value specified in this field to the sbatch command that will spawn the session.
Thanks to good defaults set up in the SPANK plugin (--singularity-container-path|path, --singularity-bind|bind), the user only needs to worry about the container name, and even that is cached once typed in.
- With the current implementation, the slurm launcher will produce warning messages "Failed to get job metadata". This is due to the implementation of the launcher, which expects the job metadata at the start of the slurm standard output file. With the SPANK plugin, however, the first line in standard output is "Start singularity container...". Customers who would like to get rid of these messages need to comment out line 43 of slurm-singularity-wrapper.sh
- Start time of the Singularity R Sessions can be a little bit longer compared to native sessions. This is mostly due to the load time of the singularity container.
renv is an R package that is used for R package management. It enables the reproducible usage of R packages.
renv maintains a project-specific renv.lock file where all the metadata (packages, versions, repository information) is stored. When using a version-controlled workflow, this file needs to be stored in the source code repository. Any other file or directory (e.g. renv subfolder) can be considered transient and does not need to be added to version control.
In the case of using git, it is advisable to create a file .gitignore in the root folder of the project and add the line
renv
into that file.
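Adding the entry can be done idempotently from the shell. The helper name `ignore_renv` below is hypothetical; the sketch assumes that an existing .gitignore ends with a newline.

```shell
# Add the line "renv" to .gitignore in the given project root unless an
# identical line is already present.
ignore_renv() {
    project_root=${1:-.}
    gitignore=$project_root/.gitignore
    touch "$gitignore"
    grep -qxF 'renv' "$gitignore" || echo 'renv' >> "$gitignore"
}
```

Calling `ignore_renv` repeatedly in the same project leaves a single `renv` line in the file.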
renv::init() will initialize a project for the use of renv. It scans the R code files in the current directory to detect the packages needed and checks whether each package is already in the renv cache in the version it would otherwise download from the defined repositories. If it is not there, it will install the package into the local subfolder (renv). With the exception of the renv package itself, any R package will then be moved to the cache and a symbolic link created at its original location. If the package is already in the cache in the requested version, only a symlink will be created.
The advantage of this is that once an R package is in the cache, subsequent installations of commonly used R packages will be much faster.
By default the package cache is created in each user's home directory (~/.local/share/renv). This can be changed by defining RENV_PATHS_CACHE in Renviron.site of the R installation. The variable should point to a common folder with appropriate write permissions for everyone.
On systems that use multiple operating systems and Linux distributions, setting
RENV_PATHS_PREFIX_AUTO = TRUE
can be useful: the cache directory structure will then contain an extra directory level named according to the OS used.
For r-session-complete we set
RENV_PATHS_PREFIX_AUTO = TRUE
RENV_PATHS_CACHE=/scratch/renv
to create a global package cache shared by users and across nodes.
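For a native R installation, these settings could be written to Renviron.site with a helper like the one sketched below. The function name `set_renviron_var` is hypothetical; it replaces an existing assignment of the key instead of appending a duplicate.

```shell
# Set KEY=VALUE in an Renviron-style file, replacing a previous entry.
set_renviron_var() {
    file=$1
    key=$2
    value=$3
    touch "$file"
    tmp=$(mktemp)
    # Drop any previous assignment of the key (grep exits non-zero when
    # nothing is left to print, which is fine here) ...
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" || true
    # ... then append the new assignment and write the file back.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Assuming R is on the PATH, a call could look like `set_renviron_var "$(R RHOME)/etc/Renviron.site" RENV_PATHS_CACHE /scratch/renv`.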
If you want to use such functionality, please make sure you set the appropriate ACLs and ensure that they are replicated further downstream.
A very open ACL for the package cache would be
# file: scratch/renv/
# owner: root
# group: root
user::rwx
group::rwx
mask::rwx
other::rwx
default:user::rwx
default:group::rwx
default:mask::rwx
default:other::rwx
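An ACL like the one listed above could be applied with setfacl, as in the following sketch. The helper names `run` and `open_cache_acl` are hypothetical; with DRY_RUN=1 the commands are only printed, which allows them to be reviewed before anything is changed.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Apply the wide-open access ACL and the matching default ACL to a
# directory (the default ACL is inherited by newly created entries).
open_cache_acl() {
    dir=$1
    perms='u::rwx,g::rwx,o::rwx,m::rwx'
    run setfacl -m "$perms" "$dir" &&
    run setfacl -d -m "$perms" "$dir"
}
```

For example, `DRY_RUN=1 open_cache_acl /scratch/renv` prints the two setfacl calls without modifying the directory.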
If you would like to run the code of a colleague who uses renv in their work, you need to run renv::restore() in the root folder of the project. This command will set up the environment and retrieve all the R packages as defined in renv.lock.
During code development, new packages will be needed. In order to stay in sync with the renv.lock file, it is advisable to run renv::snapshot() from time to time and check in the changes in renv.lock together with the code commits.
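From a shell or a batch job, the restore step could be wrapped as sketched below. The function name `restore_project` is hypothetical, and the sketch assumes that Rscript is on the PATH and that renv is already installed in the R installation, as it is in the containers described earlier.

```shell
# Restore the package library of an renv project without prompting.
restore_project() {
    project_root=${1:-.}
    if [ ! -f "$project_root/renv.lock" ]; then
        echo "no renv.lock found in $project_root" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$project_root" && Rscript -e 'renv::restore(prompt = FALSE)')
}
```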
renv and package installation in general can be sped up using binary packages, either served from CRAN or from RStudio Package Manager.
As of renv 0.15.1, parallel installation of R packages is supported via the use of pak. This can be activated by setting
options(renv.config.pak.enabled = TRUE)