The Slurm Workload Manager
The Nova cluster uses Slurm to manage how jobs use the compute resources of the HPC cluster. It acts as the brains of the cluster. Users submit job requests that specify the application commands they want to run along with the hardware resources the job will need. Slurm places each request in a job queue, where it waits until the requested resources become available. When the hardware is free, Slurm assigns the compute resources to the job and launches the application under the environment of the user who submitted it. Slurm tries to ensure that all users have fair use of the hardware, and its internal accounting system keeps track of the resources allocated to and used by each research account.
Commonly Used Slurm Commands
There are several commands you can use to interact with Slurm, and many of them share similar options. Refer to each command's man page (for example, man squeue) for the full details.
squeue - Show the Job Queue
Shows information about jobs in the queue. Some usage examples:
Show all jobs belonging to user.name: $ squeue -u user.name
Show all jobs belonging to the account account.name: $ squeue -A account.name
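As a rough sketch of how squeue output can be post-processed, the snippet below counts your jobs by state. The function name summarize_states is made up for this illustration; the squeue options used are -h (no header line) and -o "%T" (print only the job state). The squeue call is guarded so the snippet does nothing on a machine without Slurm.

```shell
#!/bin/bash
# Count job states read from standard input, one state per line,
# and print them as STATE=count. (Hypothetical helper for this example.)
summarize_states() {
    sort | uniq -c | awk '{ printf "%s=%s\n", $2, $1 }'
}

# Only query Slurm when squeue is actually installed on this machine.
if command -v squeue >/dev/null 2>&1; then
    squeue -h -u "$USER" -o "%T" | summarize_states
fi
```

With two running jobs and one pending job, this would print PENDING=1 followed by RUNNING=2.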
salloc - Request an interactive session
The salloc command requests an interactive shell on the compute nodes instead of running a job in batch mode (see sbatch below). This is a very useful way to develop and test an application. (For full graphical interactive sessions, see OnDemand.)
Request an interactive session with 1 node, 32 cores, and 128G memory, for 4 hours:
$ salloc --nodes=1 --ntasks=32 --mem=128G --time=4:00:00
Request an interactive shell session on a node with 1 GPU and 8 CPU cores:
$ salloc --nodes=1 --ntasks=8 --mem=128G --gres=gpu:1 --time=4:00:00
Request an interactive shell with 4 nodes, 32 cores per node, for 8 hours on the reserved partition:
$ salloc --nodes=4 --ntasks=128 --ntasks-per-node=32 --time=8:00:00 --partition=reserved
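In the multi-node request above, the total task count (128) is the number of nodes (4) times the tasks per node (32). The sketch below, using a made-up helper named build_salloc, does that arithmetic and prints the resulting command so it can be checked before it is run. It is an illustration only, not a site-provided tool.

```shell
#!/bin/bash
# Build an salloc command line for a whole-node style request.
# Arguments: number of nodes, tasks (cores) per node, time limit.
# (build_salloc is a hypothetical helper for this example.)
build_salloc() {
    nodes=$1
    per_node=$2
    limit=$3
    total=$((nodes * per_node))   # --ntasks is the total across all nodes
    echo "salloc --nodes=$nodes --ntasks=$total --ntasks-per-node=$per_node --time=$limit"
}

# Print the command instead of running it, so it can be reviewed first.
build_salloc 4 32 8:00:00
```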
Commonly Used Job Parameters
Both salloc and sbatch use the same job allocation parameters. Here are the most commonly used ones:
--nodes=<number> The number of nodes requested.
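--ntasks=<number> The total number of tasks requested. Slurm allocates one CPU core per task by default, so this is usually the total number of CPU cores.
--ntasks-per-node=<number> The maximum number of tasks (cores) to place on each node.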
--gres=<resource> A consumable resource. Mostly used for GPUs, such as
--gres=gpu:1 (a single GPU)
--gres=gpu:a100:1 (a single A100 GPU)
--time=hh:mm:ss The time limit of the job
--time=dd-hh:mm:ss Same as above except dd is the number of days.
--chdir=<directory> Set the working directory (for sbatch only, not salloc).
--begin=<time> Defers the start of the job until the given time.
--account=<account> Sets the account for the job. You must be a member of <account>.
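The --time value can be easy to misread: besides hh:mm:ss and dd-hh:mm:ss, Slurm also accepts shorter forms such as minutes, minutes:seconds, days-hours, and days-hours:minutes. The sketch below converts a time limit to seconds so you can check what a value means before submitting. The function name slurm_time_to_seconds is made up for this illustration and it does no validation of its input.

```shell
#!/bin/bash
# Convert a Slurm time limit to seconds. (Hypothetical helper.)
# Handles: minutes, minutes:seconds, hours:minutes:seconds,
#          days-hours, days-hours:minutes, days-hours:minutes:seconds
slurm_time_to_seconds() {
    echo "$1" | awk '{
        days = 0; spec = $0
        if (index(spec, "-") > 0) {
            # With a day prefix the remaining fields start at hours.
            split(spec, dh, "-"); days = dh[1]; spec = dh[2]
            n = split(spec, f, ":")
            h = f[1]; m = (n >= 2) ? f[2] : 0; s = (n >= 3) ? f[3] : 0
        } else {
            # Without a day prefix a single number means minutes.
            n = split(spec, f, ":")
            if (n == 3)      { h = f[1]; m = f[2]; s = f[3] }
            else if (n == 2) { h = 0;    m = f[1]; s = f[2] }
            else             { h = 0;    m = f[1]; s = 0 }
        }
        print days * 86400 + h * 3600 + m * 60 + s
    }'
}

slurm_time_to_seconds 4:00:00   # 4 hours
slurm_time_to_seconds 5:00      # 5 minutes, not 5 hours
```

Note that a value such as 5:00 is read as minutes:seconds, so it requests 5 minutes.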
sbatch - Submit batch scripts
The sbatch command submits batch scripts to Slurm. This is how most jobs are submitted to the cluster. A batch script is usually just a bash shell script containing the commands you want to run. The sbatch command has many options for setting job parameters. These same options can also be set inside the script by placing each one on a line that begins with '#SBATCH'. These lines must come before the first command in the script.
Example 1: Submit a simple batch script to the cluster:
The script file simple.sh contains these lines:
#!/bin/bash
# simple.sh
date
hostname
uname -a
w
Create this file with a text editor such as vim. Then make it executable: chmod +x simple.sh
Now we can submit this batch script to Slurm:
$ sbatch --time=5:00 simple.sh
Note that we must assign a time limit (--time=5:00, which is 5 minutes) for even the simplest of jobs.
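When sbatch accepts a job it prints a line of the form "Submitted batch job <id>". The sketch below captures that ID so the job can be looked up with squeue, and shows where the output goes by default (a file named slurm-<id>.out in the directory you submitted from). The function name job_id_from_message is made up for this illustration, and the submission is guarded so nothing happens on a machine without Slurm.

```shell
#!/bin/bash
# Pull the numeric job ID out of sbatch's "Submitted batch job <id>" message.
# (Hypothetical helper for this example.)
job_id_from_message() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Only submit when sbatch is available (that is, on the cluster).
if command -v sbatch >/dev/null 2>&1; then
    jobid=$(sbatch --time=5:00 simple.sh | job_id_from_message)
    squeue -j "$jobid"    # shows whether the job is pending or running
    echo "Output will be written to slurm-${jobid}.out"
fi
```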
Example 2: Submit a batch job script called my-job.sh that contains the job parameters within the script.
$ sbatch my-job.sh
The script my-job.sh contains these lines:
#!/bin/bash
#SBATCH --time=8:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --mem=128G
#SBATCH --gres=gpu:1
#SBATCH --job-name="Exp 12, Case B"
module load starccm
scontrol show hostname $SLURM_NODELIST > machine-file.txt
starccm+ -batch -mpi intel -np 8 --gpgpu auto -machinefile machine-file.txt \
    Wing1.sim
To submit this script, do:
$ sbatch my-job.sh
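In my-job.sh the process count (-np 8) repeats the value given to --ntasks, so the two can drift apart when the script is edited. One possible way to avoid that, sketched below, is to read the SLURM_NTASKS environment variable, which Slurm sets inside the job when --ntasks is given. The function name mpi_process_count and the fallback value of 1 are choices made for this illustration.

```shell
#!/bin/bash
# Number of MPI processes: use Slurm's task count when running inside a job,
# and fall back to 1 when the variable is not set (for example when testing
# the script outside of Slurm). (Hypothetical helper for this example.)
mpi_process_count() {
    echo "${SLURM_NTASKS:-1}"
}

echo "Launching with $(mpi_process_count) processes"
```

Inside the batch script, the application line could then use -np "$(mpi_process_count)" in place of the hard-coded 8.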
srun - Run parallel commands on the cluster
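The srun command is an alternative to submitting a batch script. Here you pass the parallel command itself (such as mpirun) to srun, which requests the resources, runs the command on the compute nodes, and returns when it finishes. It is usually run from the head node, and is a good way to launch a single command from the head node without running it on the head node itself.
$ srun -N 1 -n 16 -t 2:00:00 mpirun -n 16 my-app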
sacctmgr - View or modify Slurm account information
Slurm uses a database to keep track of account information. One very important piece of account information is the set of "QOS associations" (QOS stands for quality of service) for each account group. These are special groups that Slurm uses to restrict who has access to different resources on the cluster. To see which QOS associations have been set for your account (or other user accounts), use the sacctmgr command like so:
$ sacctmgr show assoc user=<username> format=account,qos -p
where <username> is the user's ISU NetID. This returns the list of accounts and their QOS associations, with the fields separated by '|' because of the -p option.
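Because the -p output is '|' delimited, it is easy to reshape with standard tools. The sketch below prints one line per account and QOS pair, assuming the output has a header row followed by rows of the form account|qos1,qos2|. The function name list_qos is made up for this illustration, and the sacctmgr call is guarded so the snippet does nothing on a machine without Slurm.

```shell
#!/bin/bash
# Read "Account|QOS|" style lines and print one "account qos" pair per line.
# The first line is treated as the header and skipped.
# (Hypothetical helper for this example.)
list_qos() {
    awk -F'|' 'NR > 1 && $1 != "" {
        n = split($2, q, ",")
        for (i = 1; i <= n; i++) print $1, q[i]
    }'
}

# Only query Slurm when sacctmgr is actually installed on this machine.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr show assoc user="$USER" format=account,qos -p | list_qos
fi
```

For example, given a made-up row lab-a|normal,gpu| this would print "lab-a normal" and "lab-a gpu" on separate lines.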