Slurm snippets
Array jobs are quite handy for automating large batches.
You may also want to post-process the output of an array job: for example, after generating frames for an animation with an array job, you might call ffmpeg to combine the frames into a movie. Slurm supports job dependencies: you can queue a job that starts only once certain conditions on earlier jobs have been met. For example, the snippet below goes into the array job script and submits a single instance of do_whatever.sh, which starts after all tasks of the array job have finished (afterany waits for the tasks to terminate, whether or not they succeeded).
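# Only the task with the smallest array index submits the follow-up job,
# so it is queued exactly once.
if [ "$SLURM_ARRAY_TASK_ID" -eq "$SLURM_ARRAY_TASK_MIN" ]
then
    echo "setting a dependent job for array job $SLURM_ARRAY_JOB_ID"
    sbatch --dependency=afterany:"$SLURM_ARRAY_JOB_ID" do_whatever.sh
fi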
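The guard above can also be wrapped in a small function so that it can be tried out without a Slurm allocation. The following is a minimal sketch under stated assumptions: `submit_followup`, `DEP_TYPE` and `DRY_RUN` are hypothetical names introduced here, not Slurm features; only the `SLURM_ARRAY_*` variables and the `afterany`/`afterok` dependency types come from Slurm itself.

```shell
# Sketch: submit a follow-up job from exactly one task of an array job.
# DEP_TYPE and DRY_RUN are hypothetical knobs for this example only.
DEP_TYPE="${DEP_TYPE:-afterany}"  # afterany: start once all tasks have ended; afterok: only if all succeeded
DRY_RUN="${DRY_RUN:-1}"           # 1 = print the sbatch command instead of running it

submit_followup() {
    script="$1"
    # Outside an array job there is nothing to do.
    [ -n "${SLURM_ARRAY_TASK_ID:-}" ] || return 0
    # Only the task with the smallest array index submits.
    [ "$SLURM_ARRAY_TASK_ID" -eq "$SLURM_ARRAY_TASK_MIN" ] || return 0
    dep="--dependency=${DEP_TYPE}:${SLURM_ARRAY_JOB_ID}"
    if [ "$DRY_RUN" = "1" ]; then
        echo "sbatch $dep $script"
    else
        sbatch "$dep" "$script"
    fi
}
```

Calling `submit_followup do_whatever.sh` at the end of the array job script with `DRY_RUN=0` would then behave like the snippet above. Setting `DEP_TYPE=afterok` makes the follow-up job start only if every array task exited successfully.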
As an example, the script below sets up an array job of 207 tasks (indices 0 to 206), each generating a frame from a single file in VisIt. The task with the smallest array index queues the movie-making script, which starts once all array tasks have finished. Note the use of environment variables to pass data (including the task indices used to select input files) to the Python script run by VisIt.
#!/bin/bash -l
#SBATCH -t 2:0:0
#SBATCH -J visit_script
#SBATCH -M carrington
#SBATCH --nodes 1
#SBATCH --ntasks 64
#SBATCH --ntasks-per-node 64
#SBATCH --array=0-206:1
#SBATCH --no-requeue
#SBATCH --mem=120G
CHUNKSIZE=$SLURM_ARRAY_TASK_STEP
export visit_render_start=$SLURM_ARRAY_TASK_ID
export visit_render_end=$((visit_render_start+CHUNKSIZE-1))
export visit_outfn="EGI_coolmess_MGA"
export visit_outfolder="/proj/mjalho/visit_scripts/${visit_outfn}"
# -p: do not fail when another array task has already created the folder
mkdir -p "$visit_outfolder"
export visit_dbstring="/wrk-vakka/group/spacephysics/vlasiator/3D/EGI/visualizations/ballooning/jlsidecar_bulk1.*.vlsv database"
export overlay_img="/home/mjalho/proj/visit_scripts/sheet_hotmess_ticks.png"
source /proj/group/spacephysics/visit/carrington-turso03/modules-new.sh
module load ImageMagick
/proj/group/spacephysics/visit/carrington-turso03/bin/visit \
-np 64 -l mpirun -la \
"--oversubscribe -mca pml ucx --mca btl ^vader,tcp,openib -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_TLS=rc,sm -x UCX_IB_ADDR_TYPE=ib_global" \
-nowin -cli -s EGI_coolmess_MGA.py
# Only the task with the smallest array index queues the movie job.
if [ "$SLURM_ARRAY_TASK_ID" -eq "$SLURM_ARRAY_TASK_MIN" ]
then
    echo "setting a dependent job for array job $SLURM_ARRAY_JOB_ID"
    sbatch --dependency=afterany:"$SLURM_ARRAY_JOB_ID" job_EGI_hotmess_movie.sh "$visit_outfn"
fi
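The script above derives the frame range of each task from `SLURM_ARRAY_TASK_STEP`. With `--array=0-206:1` the step is 1, so every task renders exactly one frame. With a larger step (say `--array=0-206:10`) each task would cover a chunk of frames, and the last chunk would run past index 206. The sketch below shows one way to clamp the range with `SLURM_ARRAY_TASK_MAX`; `chunk_range` is a hypothetical helper introduced here. Whether clamping is needed depends on how the VisIt Python script treats indices past the end of the database, which is not shown here.

```shell
# Sketch: inclusive frame range handled by one array task.
# chunk_range is a hypothetical helper; arguments: task id, step, largest array index.
chunk_range() {
    start="$1"
    step="$2"
    max="$3"
    end=$((start + step - 1))
    # Clamp so that the last chunk does not run past the final array index.
    if [ "$end" -gt "$max" ]; then
        end="$max"
    fi
    echo "$start $end"
}

# Possible use inside the job script:
#   set -- $(chunk_range "$SLURM_ARRAY_TASK_ID" "$SLURM_ARRAY_TASK_STEP" "$SLURM_ARRAY_TASK_MAX")
#   export visit_render_start="$1" visit_render_end="$2"
```

For example, `chunk_range 200 10 206` prints `200 206`, while `chunk_range 10 10 206` prints `10 19`.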