This repository contains instructions for installing and configuring Slurm on Ubuntu-20.04. There is not a lot of information available on this topic, so I created this guide to help others who are interested in using Slurm.
Before you begin, you will need to have access to some Ubuntu-20.04 machines. You will also need to have administrative privileges on these machines.
Follow these steps to install and configure Slurm on your Ubuntu-20.04 machine:
export MUNGEUSER=1001
sudo groupadd -g $MUNGEUSER munge
sudo useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u $MUNGEUSER -g munge -s /sbin/nologin munge
export SLURMUSER=1002
sudo groupadd -g $SLURMUSER slurm
sudo useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUSER -g slurm -s /bin/bash slurm
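The munge and slurm accounts need the same UID and GID on every node, which is why the commands above pin them to 1001 and 1002. Before creating the accounts it may be worth checking that those IDs are still free on each machine. A minimal sketch, assuming `getent` is available; `id_in_use` is a hypothetical helper name, not part of Slurm or MUNGE:

```shell
# Hypothetical helper: succeed if a numeric ID already exists in the given
# database ("passwd" for users, "group" for groups).
id_in_use() {
    getent "$1" "$2" >/dev/null 2>&1
}

# Check the IDs used in this guide before creating the accounts.
for id in 1001 1002; do
    if id_in_use passwd "$id" || id_in_use group "$id"; then
        echo "ID $id is already taken; pick another value"
    else
        echo "ID $id is free"
    fi
done
```

If an ID is taken, change MUNGEUSER or SLURMUSER to a free value and use that same value on all machines.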
sudo apt install munge libmunge2 libmunge-dev
sudo /usr/sbin/create-munge-key
sudo cp /etc/munge/munge.key ~
Copy munge.key to the /etc/munge/ directory of all machines (compute nodes and controller).
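One way to distribute the key is a small loop over the other machines. The sketch below is only an illustration: `copy_munge_key` and the `RUN` variable are hypothetical names, `node1` and `node2` are placeholder hostnames, and it assumes the copy in your home directory is readable by your user (it may need a `sudo chown` first) and that the munge account already exists on each target. By default it only prints the commands it would run.

```shell
# Hypothetical helper: copy the key to each host and install it with the
# ownership and mode MUNGE expects. With RUN unset, the commands are only
# printed; call it with RUN= (empty) to actually execute them.
copy_munge_key() {
    for host in "$@"; do
        ${RUN-echo} scp "$HOME/munge.key" "$host:/tmp/munge.key"
        ${RUN-echo} ssh -t "$host" "sudo install -o munge -g munge -m 0400 /tmp/munge.key /etc/munge/munge.key && rm /tmp/munge.key"
    done
}

# Dry run: prints two commands per host.
copy_munge_key node1 node2
```

Once the printed commands look right, `RUN= copy_munge_key node1 node2` runs them for real.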
sudo chown -R munge: /etc/munge/ /var/log/munge/ /var/lib/munge/ /run/munge/
sudo chmod 0700 /etc/munge/ /var/log/munge/ /var/lib/munge/ /run/munge/
sudo chmod 0755 /run/munge/
sudo systemctl enable munge
sudo systemctl start munge
sudo systemctl status munge
munge -n | unmunge | grep STATUS
munge -n | unmunge
munge -n | ssh <somehost_in_cluster(IP or hostname)> unmunge
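The three commands above all end in `unmunge`, which prints a STATUS line for the decoded credential. If you want to script the check across several nodes, that line can be tested directly. A minimal sketch; `credential_ok` is a hypothetical helper, and the sample line stands in for real `unmunge` output:

```shell
# Hypothetical helper: read `unmunge` output on stdin and succeed only if
# the STATUS line reports Success.
credential_ok() {
    grep -q '^STATUS:[[:space:]]*Success'
}

# Example with a captured status line; on a real node you would pipe
# `munge -n | unmunge` (or the ssh variant) into the function instead.
if printf 'STATUS:           Success (0)\n' | credential_ok; then
    echo "credential decoded successfully"
fi
```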
sudo apt update
sudo apt install mysql-server
sudo systemctl start mysql.service
sudo mysql
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';
exit
sudo mysql_secure_installation
mysql -u root -p
CREATE USER 'slurm'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';
GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
CREATE DATABASE slurm_acct_db;
\q
mysql -u root -p
SET GLOBAL innodb_buffer_pool_size=(2 * 1024 * 1024 * 1024);
-- innodb_log_file_size cannot be changed with SET GLOBAL; set it to 64M in the MySQL option file and restart the server
SET GLOBAL innodb_lock_wait_timeout=900;
SET GLOBAL max_allowed_packet=(16 * 1024 * 1024);
\q
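Values changed with SET GLOBAL last only until the MySQL server restarts, and to my knowledge innodb_log_file_size is not a dynamic variable at all, so it is safer to put these settings in an option file as well. The sketch below writes the same values to a scratch file for review. `write_slurm_mysql_cnf` is a hypothetical helper, and the destination path in the comments is an assumption about the Ubuntu MySQL layout that you should verify on your system.

```shell
# Hypothetical helper: write the InnoDB settings from this guide to a
# MySQL option file so they are applied at every server start.
write_slurm_mysql_cnf() {
    cat > "$1" <<'EOF'
[mysqld]
innodb_buffer_pool_size = 2G
innodb_log_file_size = 64M
innodb_lock_wait_timeout = 900
max_allowed_packet = 16M
EOF
}

# Write to a scratch file first and review it.
cnf=$(mktemp)
write_slurm_mysql_cnf "$cnf"
cat "$cnf"

# Then, assuming this is where your MySQL reads extra option files:
# sudo cp "$cnf" /etc/mysql/mysql.conf.d/slurm.cnf
# sudo systemctl restart mysql
```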
wget https://download.schedmd.com/slurm/slurm-23.11.1.tar.bz2
tar xvjf slurm-23.11.1.tar.bz2
cd slurm-23.11.1/
./configure --prefix=/usr --sysconfdir=/etc/slurm
make -j
sudo make install
#on controller
sudo cp etc/slurmctld.service /etc/systemd/system
sudo cp etc/slurmdbd.service /etc/systemd/system
#on compute nodes
sudo cp etc/slurmd.service /etc/systemd/system
sudo mkdir /var/spool/slurm /var/spool/slurm/d /var/spool/slurm/ctld
sudo mkdir /var/run/slurm /var/log/slurm
sudo chown slurm:slurm /etc/slurm/
sudo chmod 755 /etc/slurm/
Copy the configuration files to /etc/slurm. I’ve added sample configuration files for Slurm to this repository. You can also create the slurm.conf file with the Slurm configurator. Note that the compute and controller node configuration files are usually the same. However, if you don’t use hostnames to identify your machines, you’ll need to set SlurmctldHost in the compute node configuration files to the IP address of the controller. Additionally, in the sample configuration file, I assume that the database service is running on the same machine as the controller.
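To make the SlurmctldHost point concrete, here is a rough sketch of what a minimal slurm.conf could look like with the directories created earlier in this guide. It is not the sample file from this repository: `print_slurm_conf` is a hypothetical helper, and the cluster name, node names, CPU count, and partition name are placeholders you would replace with your own values.

```shell
# Hypothetical helper: print a minimal slurm.conf for a controller with the
# given hostname ($1) and IP address ($2). All other values are placeholders.
print_slurm_conf() {
    cat <<EOF
ClusterName=cluster
SlurmctldHost=$1($2)
SlurmUser=slurm
AuthType=auth/munge
StateSaveLocation=/var/spool/slurm/ctld
SlurmdSpoolDir=/var/spool/slurm/d
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdLogFile=/var/log/slurm/slurmd.log
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=$1
NodeName=node[1-2] CPUs=4 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
EOF
}

print_slurm_conf controller 192.168.1.10
```

The `hostname(address)` form of SlurmctldHost lets the nodes reach the controller by IP even when its hostname does not resolve.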
sudo apt install cgroup-tools
sudo touch /etc/slurm/cgroup.conf
#on controller
sudo systemctl start slurmdbd.service
sudo systemctl enable slurmdbd.service
sudo systemctl start slurmctld.service
sudo systemctl enable slurmctld.service
#on compute nodes
sudo systemctl start slurmd.service
sudo systemctl enable slurmd.service
sbatch job-part1.sh
sacct
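The contents of job-part1.sh are not shown here. If you just want something to submit as a smoke test, a hypothetical stand-in could look like the sketch below; the job name, output file name, and limits are illustrative choices, not values taken from this repository.

```shell
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --output=hello-%j.out
#SBATCH --ntasks=1
#SBATCH --time=00:01:00

# Print which node ran the job; the output lands in hello-<jobid>.out.
report_node() {
    echo "Hello from $(uname -n)"
}

report_node
```

After submitting it with sbatch, the job should show up in sacct once accounting is working.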
tail -f /var/log/slurm/slurmctld.log
tail -f /var/log/slurm/slurmdbd.log
tail -f /var/log/slurm/slurmd.log
To add a new plugin to Slurm, you can copy one of the plugins in slurm-23.11.1/src/plugins and modify it as needed. Then, update the configure.ac file to include your plugin path and directory. Finally, reconfigure and compile the project by running the following commands:
autoreconf -i
./configure --prefix=/usr --sysconfdir=/etc/slurm
make -j
sudo make install
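When editing configure.ac for a new plugin, it can help to look at how existing plugins of the same type are registered and mirror that entry. The sketch below lists those entries; `plugin_entries` is a hypothetical helper, `select` is only an example plugin type, and as far as I know the Makefile.am in the plugin type's directory also needs the new subdirectory added to its SUBDIRS list.

```shell
# Hypothetical helper: list the Makefile entries that a configure.ac file
# ($1) already registers for one plugin type ($2), with line numbers.
plugin_entries() {
    grep -n "src/plugins/$2/.*Makefile" "$1"
}

# Run from the top of the Slurm source tree.
if [ -f configure.ac ]; then
    plugin_entries configure.ac select
fi
```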
That's it! You should now have Slurm installed and configured on your Ubuntu-20.04 machines. If you have any questions or run into any issues, feel free to reach out to me.