Merge pull request #6 from RuiApostolo/gh-pages
Added GPU exercise and info on KOKKOS package
RuiApostolo authored Oct 16, 2024
2 parents 3bbc561 + c642b5a commit 23b5b83
Showing 12 changed files with 159 additions and 6 deletions.
48 changes: 45 additions & 3 deletions _episodes/06-extra-software.md
@@ -15,16 +15,58 @@ keypoints:

## GPU acceleration

LAMMPS has the capability to use GPUs to accelerate the calculations needed to run a simulation,
but the program needs to be compiled with the correct parameters for this option to be available.
Furthermore, LAMMPS can exploit multiple GPUs on the same system, although the performance scaling depends heavily on the particular system.
As always, we recommend that each user run benchmarks for their particular use-case to ensure that they are getting performance benefits.
While not every LAMMPS force field or fix is available on GPUs, the vast majority are, and more are added with each new version.
Check the LAMMPS documentation for GPU compatibility with a specific command.
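A quick way to check what a particular binary supports (assuming `lmp` is the LAMMPS executable on your path; the exact wording of the help output varies between LAMMPS versions) is to inspect its help text:

```bash
# List the packages this LAMMPS binary was compiled with;
# GPU-capable builds will report GPU and/or KOKKOS here.
lmp -h | grep -A 3 -i "installed packages"

# Accelerated variants of a style carry a suffix in the style listing,
# e.g. lj/cut/gpu or lj/cut/kk alongside plain lj/cut.
lmp -h | grep "lj/cut"
```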

There are two main LAMMPS packages that add GPU acceleration: `GPU` and `KOKKOS`.
They differ in many ways, including which `fixes` and `computes` are implemented for each package (as always, consult the manual),
but the main difference is the underlying framework used by each.
The command-line flags needed for each also differ; we have examples of each below.

ARCHER2 has the `lammps-gpu` module, which has LAMMPS compiled with the `KOKKOS` package.
Due to the model of AMD GPUs and the compiler/driver versions available, it has not been possible to compile LAMMPS with the `GPU` package there.
Cirrus, a Tier-2 HPC system also hosted at EPCC, has a `lammps-gpu` module installed with the `GPU` package.
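On either machine, you can check which LAMMPS modules are available before writing a job script (a generic sketch; the module names and versions listed depend on the system):

```bash
# Show the LAMMPS modules installed on the current system
module avail lammps
```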

### KOKKOS package

In `exercises/4-gpu-simulation` you can find the Lennard-Jones input file used in exercise 2, with a larger number of atoms,
and a Slurm script that loads the `lammps-gpu` module and runs the simulation on a GPU.

The main differences are in some of the `#SBATCH` lines, where we now have to set the number of GPUs to request and use a different partition/QoS combination:

```bash
...
#SBATCH --nodes=1
#SBATCH --gpus=1
...
#SBATCH --partition=gpu
#SBATCH --qos=gpu-shd
```

and the `srun` line, where we have to set the number of tasks, the CPUs per task, the hints, and the distribution (Slurm won't allow these on the `#SBATCH` lines),
as well as adding the `KOKKOS`-specific flags `-k on g 1 -pk kokkos -sf kk`:

```bash
srun --ntasks=1 --cpus-per-task=1 --hint=nomultithread --distribution=block:block \
lmp -k on g 1 -pk kokkos -sf kk -i in.lj_exercise -l log_gpu.$SLURM_JOB_ID
```

The `-k on g 1` flag tells `KOKKOS` to use 1 GPU, and `-sf kk` appends the `/kk` suffix to all styles that support it.
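If you prefer to keep these settings with the simulation itself, the suffix can also be requested from inside the input script; a minimal sketch (note that `-k on g 1` must still be passed on the command line, as it configures `KOKKOS` before the input script is read):

```
# Equivalent of the -sf kk command-line flag:
# append the /kk suffix to every style that supports it
suffix kk

# Or request the KOKKOS variant of a single style explicitly
pair_style lj/cut/kk 3.5
```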


### GPU package

To use the GPU accelerated commands with this package, you need to pass an extra flag when calling the LAMMPS binary: `-pk gpu <number_of_gpus_to_use>`.
You will also need to add the `/gpu` suffix to every style you want accelerated or,
alternatively, you can use the `-sf gpu` flag to append the `/gpu` suffix to all styles that support it (though this is at your own risk).
For example, if you were to run exercise 4 on Cirrus, you could change the line:

```
srun lmp -i in.ethanol -l log.$SLURM_JOB_ID
```
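to something along the following lines, combining the `-pk gpu` and `-sf gpu` flags described above (a sketch for a single GPU; check the Cirrus documentation for the exact launch options):

```bash
# Run the same input, offloading supported styles to 1 GPU
srun lmp -pk gpu 1 -sf gpu -i in.ethanol -l log.$SLURM_JOB_ID
```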
2 changes: 1 addition & 1 deletion exercises/1-performance-exercise/sub.slurm
@@ -3,7 +3,7 @@
# Slurm job options (name, number of compute nodes, job time)
#SBATCH --job-name=lmp_ex1
#SBATCH --nodes=1
-#SBATCH --time=0:20:0
+#SBATCH --time=0:10:0
#SBATCH --hint=nomultithread
#SBATCH --distribution=block:block
#SBATCH --tasks-per-node=64
2 changes: 1 addition & 1 deletion exercises/2-lj-exercise/sub.slurm
@@ -3,7 +3,7 @@
# Slurm job options (name, number of compute nodes, job time)
#SBATCH --job-name=lmp_ex2
#SBATCH --nodes=1
-#SBATCH --time=0:20:0
+#SBATCH --time=0:10:0
#SBATCH --hint=nomultithread
#SBATCH --distribution=block:block
#SBATCH --tasks-per-node=128
2 changes: 1 addition & 1 deletion exercises/3-advanced-inputs-exercise/sub.slurm
@@ -3,7 +3,7 @@
# Slurm job options (name, number of compute nodes, job time)
#SBATCH --job-name=lmp_ex3
#SBATCH --nodes=1
-#SBATCH --time=0:20:0
+#SBATCH --time=0:10:0
#SBATCH --hint=nomultithread
#SBATCH --distribution=block:block
#SBATCH --tasks-per-node=128
85 changes: 85 additions & 0 deletions exercises/4-gpu-simulation/in.lj_exercise
@@ -0,0 +1,85 @@
####################################
# Example LAMMPS input script #
# for a simple Lennard Jones fluid #
####################################

####################################
# 0) Define variables
####################################

variable DENSITY equal 0.8

####################################
# 1) Set up simulation box
# - We set a 3D periodic box
# - Our box has 50x50x50 atom
# positions, evenly distributed
# - The atom starting sites are
# separated such that the box density
# is 0.8
####################################

units lj
atom_style atomic
dimension 3
boundary p p p

lattice sc ${DENSITY}
region box block 0 50 0 50 0 50
create_box 1 box
create_atoms 1 box

####################################
# 2) Define interparticle interactions
# - Here, we use truncated & shifted LJ
# - All atoms of type 1 (in this case, all atoms)
# have a mass of 1.0
####################################

pair_style lj/cut 3.5
pair_modify shift yes
pair_coeff 1 1 1.0 1.0
mass 1 1.0

####################################
# 3) Neighbour lists
# - Each atom will only consider neighbours
# within a distance of 3.8 of each other
# (pair cutoff 3.5 plus skin 0.3)
# - Neighbour list rebuilds are checked
# every timestep, but delayed until at
# least 10 timesteps after the last one
####################################

neighbor 0.3 bin
neigh_modify delay 10 every 1

####################################
# 4) Define simulation parameters
# - We fix the temperature and
# linear and angular momenta
# of the system
# - We run with fixed number (n),
# volume (v), temperature (t)
####################################

fix LinMom all momentum 50 linear 1 1 1 angular
fix 1 all nvt temp 1.00 1.00 5.0
#fix 1 all npt temp 1.0 1.0 25.0 iso 1.5150 1.5150 10.0

####################################
# 5) Final setup
# - Define starting particle velocity
# - Define timestep
# - Define output system properties (temp, energy, etc.)
# - Define simulation length
####################################

velocity all create 1.0 199085 mom no

timestep 0.005

thermo_style custom step temp etotal pe ke press vol density
thermo 500

run_style verlet

run 50000
26 changes: 26 additions & 0 deletions exercises/4-gpu-simulation/sub.slurm
@@ -0,0 +1,26 @@
#!/bin/bash

# Slurm job options (name, number of compute nodes, job time)
#SBATCH --job-name=lmp_ex4
#SBATCH --nodes=1
#SBATCH --gpus=1
#SBATCH --time=0:05:0

# The budget code of the project
#SBATCH --account=ta176
# GPU partition
#SBATCH --partition=gpu
# Shared GPU QoS, as we only need one of the node's GPUs
#SBATCH --qos=gpu-shd

# load the lammps module
module load lammps-gpu

# Set the number of threads to 1
# This prevents any threaded system libraries from automatically
# using threading.
export OMP_NUM_THREADS=1

# Launch the parallel job
srun --ntasks=1 --cpus-per-task=1 --hint=nomultithread --distribution=block:block \
lmp -k on g 1 -pk kokkos -sf kk -i in.lj_exercise -l log_gpu.$SLURM_JOB_ID
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
