diff --git a/_episodes/06-extra-software.md b/_episodes/06-extra-software.md
index f8c2c4c..b3b1c5c 100644
--- a/_episodes/06-extra-software.md
+++ b/_episodes/06-extra-software.md
@@ -15,16 +15,67 @@ keypoints:
 
 ## GPU acceleration
 
-LAMMPS has the capability to use GPUs to accelerate the calculations needed to run a simulation, but the program needs to be compiled with the correct parameters for this option to be available.
+LAMMPS has the capability to use GPUs to accelerate the calculations needed to run a simulation,
+but the program needs to be compiled with the correct parameters for this option to be available.
 Furthermore, LAMMPS can exploit multiple GPUs on the same system, although the performance scaling depends heavily on the particular system.
 As always, we recommend that each user should run benchmarks for their particular use-case to ensure that they are getting performance benefits.
 While not every LAMMPS force field or fix is available for GPU, a vast majority are, and more are added with each new version.
 Check the LAMMPS documentation for GPU compatibility with a specific command.
 
-To use the GPU accelerated commands, you will need to be an extra flag when calling the LAMMPS binary: `-pk gpu `.
+There are two main LAMMPS packages that add GPU acceleration: `GPU` and `KOKKOS`.
+The packages differ in many ways, including which `fixes` and `computes` each implements (as always, consult the manual),
+but the main difference is the underlying framework that each uses.
+The flags needed for each are also different; we give some examples of each below.
+
+ARCHER2 has the `lammps-gpu` module, which provides LAMMPS compiled with the `KOKKOS` package.
+Due to the model of AMD GPUs and the compiler/driver versions available, it has not been possible to compile LAMMPS with the `GPU` package.
+Cirrus, a Tier-2 HPC system also hosted at EPCC, has a `lammps-gpu` module installed with the `GPU` package.
+
+### KOKKOS package
+
+In `exercises/4-gpu-simulation` you can find the Lennard-Jones input file used in exercise 2, with a larger number of atoms,
+and a Slurm script that loads the `lammps-gpu` module and runs the simulation on a GPU.
+
+The main differences are in some of the `#SBATCH` lines, where we now have to set the number of GPUs to request and use a different partition/QoS combination:
+
+```bash
+...
+#SBATCH --nodes=1
+#SBATCH --gpus=1
+...
+#SBATCH --partition=gpu
+#SBATCH --qos=gpu-shd
+```
+
+and in the `srun` line, where we have to set the number of tasks, the CPUs per task, the hints, and the distribution (Slurm won't allow these on the `#SBATCH` lines),
+as well as adding the `KOKKOS`-specific flags `-k on g 1 -pk kokkos -sf kk`:
+
+```bash
+srun --ntasks=1 --cpus-per-task=1 --hint=nomultithread --distribution=block:block \
+lmp -k on g 1 -pk kokkos -sf kk -i in.lj_exercise -l log_gpu.$SLURM_JOB_ID
+```
+
+The `-k on g 1` flag tells `KOKKOS` to use 1 GPU, `-pk kokkos` sets the `KOKKOS` package options (here we keep the defaults), and `-sf kk` appends the `/kk` suffix to all styles that support it.
+
+### GPU package
+
+To use the GPU-accelerated commands, you would need to add an extra flag when calling the LAMMPS binary: `-pk gpu N` (where `N` is the number of GPUs to use).
 You will also need to add the `/gpu` suffix to all the styles intended to be accelerated this way or, alternatively, you can use the `-sf gpu` flag to append the `/gpu` suffix to all styles that support it (though this is at your own risk).
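+
+A minimal sketch of the manual approach, using the `lj/cut` pair style from the exercise files
+(the `/gpu` variant is only available if it was included when the `GPU` package was built), would be to change:
+
+```
+pair_style lj/cut 3.5
+```
+
+to:
+
+```
+pair_style lj/cut/gpu 3.5
+```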
-So, for example, if ARCHER2 had GPUs, you would change the `srun` line from:
+
+For example, if you were to run exercise 4 on Cirrus, you could change the line:
 
 ```
 srun lmp -i in.ethanol -l log.$SLURM_JOB_ID
diff --git a/exercises/1-performance-exercise/sub.slurm b/exercises/1-performance-exercise/sub.slurm
index e8b6670..bd0f403 100644
--- a/exercises/1-performance-exercise/sub.slurm
+++ b/exercises/1-performance-exercise/sub.slurm
@@ -3,7 +3,7 @@
 # Slurm job options (name, number of compute nodes, job time)
 #SBATCH --job-name=lmp_ex1
 #SBATCH --nodes=1
-#SBATCH --time=0:20:0
+#SBATCH --time=0:10:0
 #SBATCH --hint=nomultithread
 #SBATCH --distribution=block:block
 #SBATCH --tasks-per-node=64
diff --git a/exercises/2-lj-exercise/sub.slurm b/exercises/2-lj-exercise/sub.slurm
index 1698268..b3c9d62 100644
--- a/exercises/2-lj-exercise/sub.slurm
+++ b/exercises/2-lj-exercise/sub.slurm
@@ -3,7 +3,7 @@
 # Slurm job options (name, number of compute nodes, job time)
 #SBATCH --job-name=lmp_ex2
 #SBATCH --nodes=1
-#SBATCH --time=0:20:0
+#SBATCH --time=0:10:0
 #SBATCH --hint=nomultithread
 #SBATCH --distribution=block:block
 #SBATCH --tasks-per-node=128
diff --git a/exercises/3-advanced-inputs-exercise/sub.slurm b/exercises/3-advanced-inputs-exercise/sub.slurm
index 97c5ec4..e51636c 100644
--- a/exercises/3-advanced-inputs-exercise/sub.slurm
+++ b/exercises/3-advanced-inputs-exercise/sub.slurm
@@ -3,7 +3,7 @@
 # Slurm job options (name, number of compute nodes, job time)
 #SBATCH --job-name=lmp_ex3
 #SBATCH --nodes=1
-#SBATCH --time=0:20:0
+#SBATCH --time=0:10:0
 #SBATCH --hint=nomultithread
 #SBATCH --distribution=block:block
 #SBATCH --tasks-per-node=128
diff --git a/exercises/4-gpu-simulation/in.lj_exercise b/exercises/4-gpu-simulation/in.lj_exercise
new file mode 100644
index 0000000..9f25faa
--- /dev/null
+++ b/exercises/4-gpu-simulation/in.lj_exercise
@@ -0,0 +1,87 @@
+####################################
+# Example LAMMPS input script      #
+# for a simple Lennard-Jones fluid #
+####################################
+
+####################################
+# 0) Define variables
+####################################
+
+variable DENSITY equal 0.8
+
+####################################
+# 1) Set up simulation box
+# - We set a 3D periodic box
+# - Our box has 50x50x50 atom
+#   positions, evenly distributed
+# - The atom starting sites are
+#   separated such that the box density
+#   is 0.8
+####################################
+
+units lj
+atom_style atomic
+dimension 3
+boundary p p p
+
+lattice sc ${DENSITY}
+region box block 0 50 0 50 0 50
+create_box 1 box
+create_atoms 1 box
+
+####################################
+# 2) Define interparticle interactions
+# - Here, we use truncated & shifted LJ
+# - All atoms of type 1 (in this case, all atoms)
+#   have a mass of 1.0
+####################################
+
+pair_style lj/cut 3.5
+pair_modify shift yes
+pair_coeff 1 1 1.0 1.0
+mass 1 1.0
+
+####################################
+# 3) Neighbour lists
+# - Each atom will only consider neighbours
+#   within a distance of 3.8 (pair cutoff 3.5 plus 0.3 skin)
+# - The neighbour lists are checked every timestep,
+#   but rebuilt no sooner than 10 steps after the last build
+####################################
+
+neighbor 0.3 bin
+neigh_modify delay 10 every 1
+
+####################################
+# 4) Define simulation parameters
+# - We fix the temperature and
+#   linear and angular momenta
+#   of the system
+# - We run with fixed number (n),
+#   volume (v), temperature (t)
+####################################
+
+fix LinMom all momentum 50 linear 1 1 1 angular
+fix 1 all nvt temp 1.00 1.00 5.0
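+# (fix nvt arguments: hold the temperature at 1.00 throughout the run,
+#  with a thermostat damping time of 5.0 in LJ time units)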
+#fix 1 all npt temp 1.0 1.0 25.0 iso 1.5150 1.5150 10.0
+
+####################################
+# 5) Final setup
+# - Define starting particle velocity
+# - Define timestep
+# - Define output system properties (temp, energy, etc.)
+# - Define simulation length
+####################################
+
+velocity all create 1.0 199085 mom no
+
+timestep 0.005
+
+thermo_style custom step temp etotal pe ke press vol density
+thermo 500
+
+run_style verlet
+
+run 50000
diff --git a/exercises/4-gpu-simulation/sub.slurm b/exercises/4-gpu-simulation/sub.slurm
new file mode 100644
index 0000000..6367aa3
--- /dev/null
+++ b/exercises/4-gpu-simulation/sub.slurm
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+# Slurm job options (name, number of compute nodes, job time)
+#SBATCH --job-name=lmp_ex4
+#SBATCH --nodes=1
+#SBATCH --gpus=1
+#SBATCH --time=0:05:0
+
+# The budget code of the project
+#SBATCH --account=ta176
+# GPU partition
+#SBATCH --partition=gpu
+# Shared GPU QoS, since we only request a single GPU
+#SBATCH --qos=gpu-shd
+
+# Load the LAMMPS GPU module
+module load lammps-gpu
+
+# Set the number of threads to 1
+# This prevents any threaded system libraries from automatically
+# using threading.
+export OMP_NUM_THREADS=1
+
+# Launch the parallel job
+srun --ntasks=1 --cpus-per-task=1 --hint=nomultithread --distribution=block:block \
+lmp -k on g 1 -pk kokkos -sf kk -i in.lj_exercise -l log_gpu.$SLURM_JOB_ID
diff --git a/exercises/4-creating-topology/ethanol.tcl b/exercises/5-creating-topology/ethanol.tcl
similarity index 100%
rename from exercises/4-creating-topology/ethanol.tcl
rename to exercises/5-creating-topology/ethanol.tcl
diff --git a/exercises/4-creating-topology/ethanol.xyz b/exercises/5-creating-topology/ethanol.xyz
similarity index 100%
rename from exercises/4-creating-topology/ethanol.xyz
rename to exercises/5-creating-topology/ethanol.xyz
diff --git a/exercises/4-creating-topology/pack.inp b/exercises/5-creating-topology/pack.inp
similarity index 100%
rename from exercises/4-creating-topology/pack.inp
rename to exercises/5-creating-topology/pack.inp
diff --git a/exercises/4-creating-topology/topo.tcl b/exercises/5-creating-topology/topo.tcl
similarity index 100%
rename from exercises/4-creating-topology/topo.tcl
rename to exercises/5-creating-topology/topo.tcl
diff --git a/exercises/4-creating-topology/water.tcl b/exercises/5-creating-topology/water.tcl
similarity index 100%
rename from exercises/4-creating-topology/water.tcl
rename to exercises/5-creating-topology/water.tcl
diff --git a/exercises/4-creating-topology/water.xyz b/exercises/5-creating-topology/water.xyz
similarity index 100%
rename from exercises/4-creating-topology/water.xyz
rename to exercises/5-creating-topology/water.xyz