
SLURM job scheduler #62

Open
suujia opened this issue Jul 19, 2018 · 3 comments

Comments

@suujia
Contributor

suujia commented Jul 19, 2018

The job scheduler most commonly used with Singularity and for bioinformatics analysis jobs is Slurm. Its benefits include:

  • load balancing across multiple nodes
  • limiting RAM/process size per job
  • limiting job run time

Implementation -- singularity_job.slurm

#!/bin/bash
#SBATCH --ntasks=1            # number of tasks/processes run in this job
#SBATCH --time=01:00:00       # one-hour runtime limit
#SBATCH --mem-per-cpu=4096    # memory required per CPU in MB (default is 1 GB)
#SBATCH --nodes=2
#SBATCH --output=slurm.out
#SBATCH --cpus-per-task=1
#SBATCH --requeue             # if the allocated node hangs, the job is requeued
#SBATCH --gres=lscratch:500   # scratch disk space available in /lscratch, in GB
# When the job terminates, everything in /lscratch/$SLURM_JOB_ID is deleted
# automatically; copy anything that needs saving to /data before the job concludes.

CONTAINER=/opt/singularity/orca.simg
OVERLAY=$HOME/.orca/overlay.simg

module load singularity
# hostname
# exec (not build) runs a program/command in the container on the allocated
# resources; <command-to-run> is a placeholder for the actual program.
singularity exec --overlay ${OVERLAY} ${CONTAINER} <command-to-run>

Submit the job

$ sbatch singularity_job.slurm
$ cat slurm.out        # --output above overrides the default slurm-<jobid>.out name
$ squeue

Etc

mkdir lscratch
export SINGULARITY_BINDPATH="/data/$USER:/data,/fdb:/resources,/lscratch/$SLURM_JOB_ID:/tmp"
singularity shell my-container.simg
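
As a quick sanity check of the binds (a sketch reusing the mapping above; my-container.simg is the same placeholder image):

singularity exec my-container.simg ls /data /resources /tmp   # the bound host directories should be visible here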

"we use symbolic links to refer to our network storage systems. As a result, you will need to bind some additional directories to the ones you know of and use directly to ensure that the symbolic link destinations are also bound into the container." hpc nih

Changing TMPDIR from /tmp (8GB per node) to /lscratch (set your own limit)

  • this is also better for multi-user access
  • allocate local scratch disk for jobs

#!/bin/bash
export TMPDIR=/lscratch/$SLURM_JOB_ID   # point temporary files at the per-job scratch directory
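
Putting the scratch allocation and TMPDIR together, a minimal job-script sketch (the bind and the final command are assumptions layered on the snippets above, not a tested recipe):

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --gres=lscratch:500               # per-job local scratch in GB (site-specific gres, as above)

export TMPDIR=/lscratch/$SLURM_JOB_ID     # host-side temporary files go to per-job scratch, not /tmp

module load singularity
export SINGULARITY_BINDPATH="/lscratch/$SLURM_JOB_ID:/tmp"   # scratch appears as /tmp in the container
singularity exec /opt/singularity/orca.simg df -h /tmp       # /tmp inside the container is the job's scratch space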
@suujia
Contributor Author

suujia commented Jul 31, 2018

squeue - shows all jobs currently running in the cluster
squeue -u username - shows only your own jobs
sinfo - shows node states; nodes in the "idle" state are available
srun - launch interactive jobs
sbatch <script>.slurm - submit a batch script to be queued for work
scancel JOBID - cancel a job that is currently running (hard termination)
scontrol show job JOBID - lots of information on a job
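
A typical round trip with these commands (JOBID stands in for the number sbatch reports):

$ sbatch singularity_job.slurm        # queue the batch script; prints "Submitted batch job JOBID"
$ squeue -u $USER                     # watch only your own jobs
$ scontrol show job JOBID             # detailed information while it runs
$ scancel JOBID                       # hard-terminate it if something goes wrong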

Other

MPI is for clusters of computers working together, with each computer processing a subset of the problem. It also allows efficient transfer of data between the nodes.
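
For context, a minimal sketch of what an MPI job looks like under Slurm (the program name is a placeholder; srun starts one rank per task):

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8     # 16 MPI ranks in total
srun ./my_mpi_program           # placeholder MPI binary; srun launches all 16 ranks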

This may not be entirely necessary for our use case; the batch scheduler, Slurm, on its own may be good enough for us. With a batch script, users can:

  • queue up jobs
  • access compute nodes directly

Partition:
Accounts:

@suujia
Contributor Author

suujia commented Jul 31, 2018

Slurm Singularity Plugin

– Slurm provides host machine resource/workload management.
– Singularity provides the software environment.

Integration goal:
– Manage Singularity container images.
– Generate the required Singularity command from new srun options.

Using the plugin

srun [--sgimage=orca.simg [--sgenv=<env script>] [--sgdebug]]
     [<other srun options>] <executable>
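
For example, an invocation sketch using those options (the env script and executable names are placeholders; only --sgimage/--sgenv/--sgdebug come from the plugin, the rest are standard srun flags):

srun --sgimage=orca.simg --sgenv=orca_env.sh --ntasks=1 ./run_analysis.sh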


rsync - copy/sync files
SPANK plugin - configured in the SPANK plugstack using sg_spank.conf
SWRepo - server-node repository for Singularity container images
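
For reference, SPANK plugins are registered in Slurm's plugstack configuration; a sketch of what that entry might look like (the .so name and the conf= argument are assumptions, since they depend on how this particular plugin is packaged):

# /etc/slurm/plugstack.conf -- format: optional|required <plugin .so> [args]
optional /usr/lib64/slurm/sg_spank.so conf=/etc/slurm/sg_spank.conf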

@suujia
Contributor Author

suujia commented Aug 14, 2018

$ srun -N 2 --ntasks-per-node=8 --pty bash

This requests 2 nodes (-N 2) and says we will launch at most 8 tasks per node (--ntasks-per-node=8), and that we want to run a login shell (bash) on the compute nodes. The --pty option is important: it gives a login prompt and a session that looks very much like a normal interactive session, but on one of the compute nodes. If you forget --pty, you will not get a login prompt and every command you enter will be run 16 = (-N 2) x (--ntasks-per-node=8) times.

I believe Slurm is still worth looking into if you are hoping for any load balancing given a high volume of jobs. Some resources say you can launch an interactive job this way. Singularity also has a built-in SPANK plugin that can run Slurm jobs inside Singularity containers.

Not sure how long the average job is for ORCA users, but this is an option! It also allows users to check how long their job has been running, to limit its size and running time, etc. Also, does running it interactively with our current system mean they must keep the program open? @sjackman
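
For the interactive case, a sketch combining the pieces already shown in this thread (image path reused from the script above; the resource flags are just an example):

srun -N 1 --ntasks=1 --time=01:00:00 --pty singularity shell /opt/singularity/orca.simg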
