HPC Batch & Interactive Job

Batch Job Submission

Batch job submission is recommended when submitting a job to the HPC. A job script, or SLURM script, is used to submit the job. In general, a SLURM script is a bash script that contains SBATCH directives to request resources for the job, file manipulation commands to handle job files, and an execution part that runs one or more programs constituting the job. Note that the SBATCH directives may not have blank lines between them -- the first blank line is interpreted as the end of the directive section. Otherwise, bash scripts may contain blank lines, and lines beginning with a hash (#) are treated as comments and ignored by the bash interpreter (see the Bash manual: https://www.gnu.org/software/bash/manual/bash.html).

When the job is submitted using the "sbatch" command, SLURM assigns a unique JobID to the job.
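
For example, submitting a script typically prints the assigned JobID; the script name and number below are placeholders:

sbatch <jobScript>
# prints: Submitted batch job 1234567    (1234567 is the JobID)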

Consider the SLURM script (example.slurm) below:

Using scratch space (see HPC Temporary and Scratch Space):

#!/bin/bash
#SBATCH -o test.o%j
#SBATCH --time=00:30:00      # 30 minutes of wall time
#SBATCH -N 1                 # 1 node
#SBATCH -c 1                 # 1 processor
#SBATCH --mem=2gb            # Assign 2gb of memory; the default is 1gb

module load <software-module>
cp -r executable <inputFile1> <inputFile2> ... <inputFileN> $PFSDIR
cd $PFSDIR
./executable <arg1> <arg2> ... <argN>
cp -ru * $SLURM_SUBMIT_DIR

The job file requests 30 minutes of wall time (--time=00:30:00), one node (-N 1) with one processor (-c 1), and 2gb of memory (--mem=2gb), and specifies with "-o test.o%j" that stdout and stderr be written to the file test.o<JobID>. These SLURM options are followed by bash commands that a shell executes on a compute node once the job starts: load the software module to set the environment variables, copy the files required for your job (the executable and inputFile1, inputFile2, ...) to the temporary (scratch) directory ($PFSDIR) associated with the job, run the executable there with any arguments it requires (arg1, arg2, ...), and copy all (*) output files back to the working directory ($SLURM_SUBMIT_DIR).

Notes: For a small amount of scratch data, you can use $TMPDIR instead of $PFSDIR, but $PFSDIR is still recommended. Also, to remove the scratch files immediately, add the following to the end of the job script:

rm -rf "$PFSDIR"/*

Submit the job:

sbatch example.slurm
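
After submission, you can monitor or cancel the job with the standard SLURM commands; the JobID is the number printed by sbatch:

squeue -u <CaseID>     # list your queued and running jobs
scancel <JobID>        # cancel the job if needed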

We recommend using the scratch space; however, if your files are very large and copying them to $PFSDIR would take too long, you can run the job directly from your working directory as shown:

Without using scratch space (NOT RECOMMENDED)


#!/bin/bash
#SBATCH -o test.o%j
#SBATCH --time=00:30:00
#SBATCH -N 1
#SBATCH -c 1
#SBATCH --mem=2gb

module load <software-module>
./<executable>

For OpenMP jobs, request the number of processors "<x>" as required.

#SBATCH -N 1 -c <x>
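
A minimal OpenMP job script might look like the sketch below; the module and executable names are placeholders, and OMP_NUM_THREADS is set from $SLURM_CPUS_PER_TASK so the thread count matches the requested processors:

#!/bin/bash
#SBATCH -o test.o%j
#SBATCH --time=00:30:00
#SBATCH -N 1 -c 4                              # 4 processors on one node

module load <software-module>                  # placeholder module
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK    # one thread per requested processor
./<openmp-executable>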

For MPI jobs, request the number of nodes "<n>", tasks "<i>", and processors per task "<x>" as required, and use "mpiexec" or "mpirun" with your executable as shown:

#SBATCH -N <n> -n <i> -c <x>

...

mpiexec <mpi-executable>
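
Putting it together, a minimal MPI job script might look like the sketch below; the module name, executable, and counts are placeholders:

#!/bin/bash
#SBATCH -o test.o%j
#SBATCH --time=01:00:00
#SBATCH -N 2 -n 8 -c 1        # 8 MPI tasks spread across 2 nodes

module load <mpi-module>      # placeholder MPI module
mpiexec ./<mpi-executable>    # mpiexec typically picks up the task count from the allocation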

Interactive Job Submission

See the appropriate Graphical Access section if you would like a graphical interface.

The following command requests any available node with the default resource allocation:

srun --pty /bin/bash

You will be assigned a compute node (the prompt will look similar to <CaseID>@compxxx ~). Your job will have a unique JobID. By default, the granted resources are -N 1 -n 1 --mem=1gb --time=10:00:00.

For GUI access:

srun --x11 -N 1 -c 2 --time=1:00:00 --pty /bin/bash

Note: This will provide two processors (-c 2) on a single node (-N 1) for 1 hour (--time=1:00:00), with graphical windows (--x11) ready. If you want to request, say, 100 hours, you can specify --time=4-04:00:00 in the format "days-hours:minutes:seconds".

Check the default memory and walltime by using:

scontrol show job <JobID> | grep MinMemory

scontrol show job <JobID> | grep TimeLimit

output:

MinCPUsNode=1 MinMemoryCPU=1000M MinTmpDiskNode=0

   RunTime=01:30:36 TimeLimit=10:00:00 TimeMin=N/A

To run an interactive job with 2 processors on 1 node and <m>gb of memory on any node type:

srun -N 1 -c 2 --mem=<m>gb --pty /bin/bash

Please request the nodes and processors appropriately to accommodate <m>gb of memory (check the per-node resources in the HPC Resource View).
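
One way to see how many CPUs and how much memory each partition's nodes provide is the standard sinfo command; the format string below is just one possible choice:

sinfo -o "%P %D %c %m"    # partition, node count, CPUs per node, memory per node (MB)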

To reserve 2 processors on 2 nodes (total of 4 processors):

srun --time=32:00:00 -N 2 -n 2 -c 2 --pty /bin/bash

To request 1 processor in partition <P>:

srun -p <P> --pty /bin/bash
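
To see which partitions are available and their time limits, the standard sinfo summary view can help:

sinfo -s    # one line per partition, including availability, time limit, and node counts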

Requesting Specific Nodes

To request a specific node, use the --nodelist option.

Request a specific node:

srun -p gpu -C gpu2v100 --nodelist=gpu059t --gres=gpu:1  --pty bash

This requests the specific node gpu059t, which is useful when you want to use only that node. Please be mindful that you may have to wait for the node to become free if someone else is running a job on it.
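
Before requesting a specific node, you can check its state, features, and memory with the standard scontrol command (gpu059t is the node from the example above):

scontrol show node gpu059t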

Exclude a particular node or multiple nodes from running your jobs:

srun --exclude=compt227,compt229 --pty bash

srun --exclude=compt[230-240] --pty bash

If you want to reserve the whole node, you can use --exclusive:

srun --exclusive --pty bash

Request an SMP node in the smp partition (-p smp):

srun -p smp -c 32 --mem=500gb --pty bash

Removing Scratch Space

Removing scratch space manually

rm -rf /scratch/pbsjobs/pbsjobs.<jobID>.hpc

Removing Scratch Files Immediately

To delete the /scratch job files immediately, before the job completes, simply add the following to the end of the job script:

rm -rf "$PFSDIR"/*

High Memory Jobs

You need to request an appropriate amount of memory to run a high-memory job; it may be terminated before completion if the memory it needs is not available (see exit codes). Calculate the amount of memory per processor from the HPC Resource View and choose the appropriate partition and number of processors. Please also refer to the Memory Estimation Guide. A memory leak can also be the culprit behind a high memory requirement; please refer to the HPC Guide to Debugging Segmentation Fault.
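
One standard way to see how much memory a completed job actually used is sacct's MaxRSS field (the JobID is a placeholder):

sacct -j <JobID> --format=JobID,MaxRSS,Elapsed,State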

Example (serial job):

If you require about 23 gb of memory for your serial job, then you can use 6 processors:

srun -N 1 -c 6 --mem=23gb --pty /bin/bash

You may want to reserve the whole node (--exclusive) if you don't know in advance how much memory your job requires. Please also use the --mem=0 flag if you need to request all of the available memory on that node:

srun --exclusive --pty bash

Example (Parallel job):

If you have a parallel job that requires 120gb of memory, one option is to use 2 nodes with as many CPU cores as are available (match the node resources).

srun -N 2 -c 24 --mem=120gb --pty /bin/bash  # example applies to thread-parallel.

srun -N 2 -n 24 --mem=120gb --pty /bin/bash  # example applies for MPI parallel.

Job Submission using different Accounts

If you are affiliated with more than one group and want to run jobs under a group account that is not your default (primary) group, use the -A option in the SLURM script as shown:

#SBATCH -A <groupAccount>
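
The same option works for interactive jobs, for example:

srun -A <groupAccount> --pty /bin/bash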

To see the list of your affiliated shares, use the following command; it will show your groups.

$ i

****Your SLURM's CPU Quota****

           account_1      128

           account_2       24

To charge the job to account_2, include the following line in the SLURM script:

#SBATCH -A account_2
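
If you prefer a standard SLURM command over the site-specific "i" utility, sacctmgr can also list the accounts associated with your user (the format fields shown are one possible choice):

sacctmgr show associations where user=$USER format=Account,User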