SLURM Scheduler

SLURM (Simple Linux Utility for Resource Management) is a job scheduler and workload manager.   

It manages access to compute servers (nodes), provides a framework for running workloads - usually parallel jobs - on these nodes, and manages a queue of pending jobs contending for resources.  For a more detailed introduction to SLURM, refer to the official SLURM documentation.

SLURM is a critical component for a large computational resource such as the Coeus and Gaia HPC clusters.   Despite the "Simple" in its name, SLURM is a fairly complicated tool, and may require some familiarization to accomplish your desired workflow.  

Getting up to speed...

Initially your goal is just to get your process running.  Then you want to make sure you can run it with the scheduler, verifying that you are using the proper input data and getting all required output.  Lastly, you should make sure your process runs efficiently, using system resources as effectively as possible.  These are the typical steps for getting a computational process set up on the cluster.

Common SLURM Commands

There are a number of commands that users regularly employ when working with the scheduler.
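
A few of the most common ones are shown below as a quick reference (a non-exhaustive sketch; the job script name and job ID are placeholders):

sbatch my_job.sh            # Submit a batch script to the queue.
squeue -u $USER             # List your pending and running jobs.
scancel <job_id>            # Cancel a job by its job ID.
sinfo                       # Show the available partitions and node states.
sacct -j <job_id>           # Show accounting information for a completed job.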

Example SBATCH Submission Scripts

The sbatch command is used to submit a job script to the scheduler for execution.  This script typically contains one or more srun commands to launch job steps, or mpiexec commands to launch parallel tasks.  It is an ordinary shell script, so you can execute any command you could run in a batch script.  Scheduler options are given on lines flagged with the #SBATCH prefix, which must appear before the first executable command in the script.


For more on --nodes, --ntasks, --ntasks-per-node, and related options, refer to the SLURM Parallelism link at the bottom of this page.

Simple SBATCH example

File: sub_simplest.sh

#!/bin/bash                         # Required.
#SBATCH --job-name simple           # Set the name that shows up in squeue.
#SBATCH --nodes 2                   # Use 2 nodes.

srun hostname                       # hostname will print the system name.
                                    # If 'srun' is omitted, this will only run on one node.

# So this script will print the system name of each of the 2 nodes it runs on.


File: sub_simple.sh

#!/bin/bash
#SBATCH --job-name simple
#SBATCH --nodes 2

# %j in a file name is replaced with the job number.  Useful for multiple runs.
#SBATCH --output simple_%j.txt        # Send the standard output to simple_<job ID>.txt
#SBATCH --error simple_%j.err         # Send the error output to simple_<job ID>.err

srun hostname


File: sub_matlab.sh

#!/bin/bash

## super simple matlab example
#SBATCH --job-name myjob
#SBATCH --nodes 1

#SBATCH --partition medium            # Put this job in the medium partition.  If not
                                      # specified, the partition defaults to medium.

#SBATCH --output myjob_%j.txt
#SBATCH --error myjob_%j.err

module load General/matlab/R2018a            # Load the Matlab module.
srun matlab -nodisplay -nojvm -r mymatlab    # Start Matlab without a display or the Java VM, and run mymatlab.m.

Simple MPICH example

This is a simple "Hello World" MPI example.
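
The submission script below assumes you have already built an MPI executable named mpi_hello.  With the MPICH module loaded, the build step might look something like this (mpi_hello.c is a placeholder for your own source file):

module load mpich/gcc              # Load the MPICH module used in the script below.
mpicc -o mpi_hello mpi_hello.c     # Compile the MPI program into the mpi_hello executable.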

File: sub_mpi_hello.sh

#!/bin/bash

#SBATCH --job-name mpi_hello

#SBATCH --nodes 2                         # Use 2 nodes.
#SBATCH --ntasks-per-node 20              # Run 20 tasks on each node.
#SBATCH --time 10:00                      # Set the maximum time (10 minutes) that the job can run.

#SBATCH --output mpi_hello_%j.txt
#SBATCH --error mpi_hello_%j.err

module load mpich/gcc
srun --mpi=pmi2 mpi_hello

Simple MVAPICH2-2.2 MPI example

This is a simple "Hello World" MPI example.

File: sub_mpi_hello.sh

#!/bin/bash

#SBATCH --job-name mpi_hello
#SBATCH --nodes 2
#SBATCH --ntasks-per-node 20
#SBATCH --time 10:00
#SBATCH --output mpi_hello_%j.txt
#SBATCH --error mpi_hello_%j.err

## Load the MVAPICH2 2.2 module built with GCC 8.2.0
module load mvapich2-2.2-psm/gcc-8.2.0
mpiexec ./mpi_hello

Simple MPI Python example (using mpi4py)

This example uses the mpi4py package.  To install this package, you will first need to create a virtual environment to contain the project, and use pip to install the mpi4py package there.  For more on package management and virtual environments, refer to the Virtual Environments How-To.
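
As a rough sketch, the one-time setup might look something like the following (the environment path is only a placeholder, and you may also need an MPI module such as mpich/gcc loaded so that mpi4py can build against it; follow the Virtual Environments How-To for the cluster-specific steps):

module load Python/gcc/3.7.5/gcc-6.3.0     # Load a Python module.
module load mpich/gcc                      # Load an MPI module so mpi4py can compile against it.
python3 -m venv ~/venvs/mpi4py-env         # Create a virtual environment.
source ~/venvs/mpi4py-env/bin/activate     # Activate it.
pip install mpi4py                         # Install mpi4py into the environment.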

Python code (hello_world_mpi.py):

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

print("hello world from process ", rank)

Submission script (hello_mpi_python.sh):

#!/bin/bash
#SBATCH --job-name hello_world_mpi
#SBATCH --time 00:10:00
#SBATCH --nodes 2
#SBATCH --output hello_world_mpi_py.txt

#SBATCH --ntasks 4                         # Request 4 tasks in total.

module load Python/gcc/3.7.5/gcc-6.3.0
mpiexec -np 4 python3 hello_world_mpi.py   # Launch 4 copies of the Python program, which
                                           # communicate with each other through MPI.

Program output (hello_world_mpi_py.txt):

hello world from process  0
hello world from process  2
hello world from process  1
hello world from process  3

Generic SBATCH Script Guideline

This is a suggestion to help make writing an SBATCH script easier - anything in  <>  is to be replaced with custom information.

#!/bin/bash
#SBATCH --job-name <Give your script a name.>
#SBATCH --partition <Select your partition.>

<Place your --nodes, --ntasks, and similar options here.>
<For more, refer to the SLURM Parallelism page linked below.>

#SBATCH --output <Standard output file>.txt
#SBATCH --error <Error output file>.err

module load <Whatever module(s) will be needed, if any.>
<Specify what to run - use mpiexec for MPI jobs or srun for independent jobs.>

A Deeper Examination of Parallelism with SLURM

Visit the page on SLURM Parallelism for information on --nodes, --ntasks, --ntasks-per-node, and more.

Job Arrays

According to the Slurm Job Array Documentation, “job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily.” In general, job arrays are useful for applying the same processing routine to a collection of multiple input data files. Job arrays offer a very simple way to submit a large number of independent processing jobs.

Submitting a single job array sbatch script creates the specified number of array tasks, as in the sketch below.
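
For illustration, a minimal job array script might look something like this (the processing command and input file names are hypothetical placeholders):

#!/bin/bash
#SBATCH --job-name array_example
#SBATCH --array 1-10                        # Create 10 array tasks, numbered 1 through 10.
#SBATCH --output array_example_%A_%a.txt    # %A is the array job ID, %a is the task index.

# Each task reads its own index from SLURM_ARRAY_TASK_ID and uses it to select an input file.
srun ./process_data input_${SLURM_ARRAY_TASK_ID}.dat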

For more on job arrays, visit the Job Arrays page.

Running Interactive processes on compute nodes

There are times when you may want to run an interactive session on a compute node.  For example, you may want to use the MATLAB graphical interface or its command-line interface.

Submitting Interactive Jobs using SLURM's salloc command
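
As a rough sketch (the partition, node count, and time limit are only examples), requesting an interactive allocation with salloc might look like this:

salloc --nodes 1 --time 1:00:00 --partition medium    # Request a 1-hour interactive allocation on one node.
srun --pty bash                                       # Start a shell on the allocated compute node.
exit                                                  # Leave the shell; exit again to release the allocation.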