Batch Job & Interactive Job Submissions

Batch Job Submission

Batch job submission is the recommended way to submit jobs to the HPC. A job script, or SLURM script, is used to submit the job. In general, the script is a bash script that contains SBATCH directives to request resources for the job, file manipulation commands to handle job files, and execution lines to run one or more programs that constitute the job. When the job is submitted with the "sbatch" command, SLURM assigns it a unique "JobID".

Consider the SLURM script (example.slurm) below:

using scratch space (see HPC Temporary and Scratch Space):

#!/bin/bash

#SBATCH -o test.o%j

#SBATCH --time=00:30:00      # 30 minutes of wall time

#SBATCH -N 1                 # 1 Node

#SBATCH -n 1                 # 1 processor

#SBATCH -p gpu               # GPU partition

#SBATCH -A <Account>         # PI's caseID or for class <PI>_<class> e.g. sxg125_csds438

#SBATCH --mem=2gb             # Assign 2gb memory; default is 1gb

module load <software-module>

cp -r executable <inputFile1> <inputFile2> ...<inputFileN> $PFSDIR

cd $PFSDIR

./executable <arg1> <arg2> ... <argN> 

cp -ru * $SLURM_SUBMIT_DIR

The job file requests 30 minutes of wall time (time=00:30:00), requests one processor (-n 1) in one node (-N 1), and specifies with ".o%j" that stdout and stderr be redirected to the file test.o<jobid>. These SLURM options are followed by bash commands that are executed by a shell on a compute node once the job starts. Load the software module to set the environment variables, copy the files (executable, inputFile1, inputFile2, ...) required for your job to the temporary (scratch) directory ($PFSDIR) associated with the job, execute there, and copy all (*) output files back to the working directory ($SLURM_SUBMIT_DIR). The executable may require arguments (arg1 arg2 ...). For details on SBATCH directives, visit HPC Guide to SLURM.

Notes: For a small amount of scratch data, you can use $TMPDIR instead of $PFSDIR, but $PFSDIR is still recommended. Also, to remove the scratch files immediately, add the following to the end of the job script:

rm -rf "$PFSDIR"/*

Submit the job:

sbatch example.slurm
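After submission, you can monitor the job with standard SLURM commands, for example (replace <CaseID> and <JobID> with your own values):

squeue -u <CaseID>           # list your queued and running jobs

scontrol show job <JobID>    # detailed information about one job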

We recommend using the scratch space; however, if your files are huge and copying them to $PFSDIR would take too long, you can run the job directly from your working directory as shown:

without using scratch space (NOT RECOMMENDED):

#!/bin/bash

#SBATCH -o test.o%j

#SBATCH --time=00:30:00

#SBATCH -N 1

#SBATCH -n 1

#SBATCH --mem=2gb

module load <software-module>

./<executable>

For OpenMP jobs, request the number of processors "<x>" as required.

#SBATCH -N 1 -n <x>
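For example, a minimal sketch of the execution part of such a script, assuming the program reads the standard OMP_NUM_THREADS environment variable (SLURM sets $SLURM_NTASKS to the value given with -n):

export OMP_NUM_THREADS=$SLURM_NTASKS   # run as many OpenMP threads as processors requested

./executable <arg1> <arg2> ... <argN>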

For MPI jobs, request the number of nodes "<n>" and processors "<x>" as required and use "mpiexec" or "mpirun" with your executable as shown:

#SBATCH -N <n> -n <x>

...

mpiexec <mpi-executable>
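Putting it together, a complete MPI job script might look like the following sketch (the module name, account, node/processor counts, and executable are placeholders to adjust for your case):

#!/bin/bash

#SBATCH -o test.o%j

#SBATCH --time=01:00:00

#SBATCH -N <n>

#SBATCH -n <x>

#SBATCH -A <Account>

module load <mpi-software-module>

mpiexec ./<mpi-executable>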

Interactive Job Submission

Notes:

For interactive GUI job submission, you need X-forwarding enabled (e.g. the -X option to ssh). For graphics-intensive GUIs (e.g. ParaView, MATLAB, etc.), it is recommended to access them through the NX client (HPC GUI Viz Access).

srun --pty /bin/bash

You will be assigned a compute node (prompt similar to <CaseID>@compxxx ~). Your job will have a unique JobID. By default, the granted resources will be -N 1 -n 1 --mem=1gb --time=10:00:00.

For GUI access:

srun --x11 -N 1 -n 2 --time=1:00:00 --pty /bin/bash

Note: This will launch two tasks (-n 2) on a single node (-N 1) for 1 hour (--time=1:00:00), with graphical windows (--x11) enabled. If you want to request, say, 100 hours, specify --time=4-04:00:00 using the format "days-hours:minutes:seconds".
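For example, the same interactive GUI request with a 100-hour limit would be:

srun --x11 -N 1 -n 2 --time=4-04:00:00 --pty /bin/bash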

Check the default memory and walltime by using:

scontrol show job <JobID> | grep MinMemory

scontrol show job <JobID> | grep TimeLimit

output:

MinCPUsNode=1 MinMemoryCPU=1000M MinTmpDiskNode=0

   RunTime=01:30:36 TimeLimit=10:00:00 TimeMin=N/A

To run interactive jobs with 2 processors on 1 node with <m>gb of memory on any node type:

srun -N 1 -n 2 --mem=<m>gb --pty /bin/bash

Please request the nodes and processors appropriately to accommodate <m>gb; see HPC Resource View.

To reserve a total of 2 processors across 2 nodes (-N 2 -n 2; one task per node):

srun --time=32:00:00 -N 2 -n 2 --pty /bin/bash

To request 1 processor in queue <Q>:

srun -p <Q> --pty /bin/bash

For example, for very high memory requirements, use the smp queue to request shared memory (SMP) nodes. For GPU requirements, use the gpufermi queue to request GPU nodes.
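For instance, a sketch of an interactive GPU request on the gpufermi queue, assuming a single GPU is requested through the standard --gres option (check the queue's GPU configuration on your cluster):

srun -p gpufermi --gres=gpu:1 --pty /bin/bash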

Also, refer to Servers and Storage.

High Memory Jobs

Note: GPU memory usage can be viewed using the nvidia-smi command:

nvidia-smi -l 2  # refresh every 2 seconds

You need to request appropriate memory to run a high-memory job; it may be terminated before completion if the memory it needs is not available. Calculate the amount of memory per processor from HPC Resource View and use the appropriate queue and number of processors. Please also refer to the Memory Estimation Guide. A memory leak can also be the culprit behind a high memory requirement; please refer to HPC Guide to Debugging Segmentation Fault. Also, check out memory management when working with pandas at https://pythonsimplified.com/how-to-handle-large-datasets-in-python-with-pandas/.
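For a job that is still running, you can also check its current memory high-water mark with the standard sstat command, for example:

sstat -j <JobID> --format=JobID,MaxRSS,MaxVMSize   # for batch jobs, you may need <JobID>.batch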

Example (serial job):

If you require about 23 gb of memory for your serial job, you can use 6 processors of a hex-core node (hex => 6 cores; 12 processors). This may need to be adjusted based on HPC Resource View.

srun -N 1 -n 6 -C <NodeFeature> --mem=23gb --pty /bin/bash

You may want to reserve the whole node (-n 12) to be on the safe side if you don't know in advance how much memory your job requires. The other way is to reserve the node exclusively with the --exclusive flag; --mem=0 needs to be included to request all the memory available on the node.


srun --exclusive --mem=0 -C <NodeFeature> --pty bash   # --mem=0 requests all memory available on the node

You can check the CPU and memory utilization for a completed job using the SLURM command seff:

seff <jobID> 

sgeff <jobid> # for GPU usage

output:

...

CPU Utilized: 00:00:04

CPU Efficiency: 10.81% of 00:00:37 core-walltime

Memory Utilized: 79.66 MB

Memory Efficiency: 7.78% of 1.00 GB

Example (Parallel job):

If you have a parallel job that requires 120gb of memory, one option is to use 2 of the 16-processor compute nodes (octa => 8 cores; 16 processors), since two octa64gb nodes provide roughly 2 x 64gb = 128gb in total. This may need to be adjusted based on HPC Resource View.

srun -N 2 -n 16 -C octa64gb --mem=120gb --pty /bin/bash

If the general nodes can't fulfill your memory requirements, use HPC shared memory (SMP) nodes. Your job may remain in the queue longer.

srun -p smp --exclusive --mem=250gb --pty bash

Job Submission using different Accounts

If you are affiliated with more than one group and want to run jobs using an account (group) that is not your default (primary) group, use the -A option in the SLURM script as shown:

#SBATCH -A <groupCaseID>
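The same account can also be selected for an interactive job by passing -A to srun, for example:

srun -A <groupCaseID> --pty /bin/bash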

To see the list of your affiliated shares, use the following command; it will list your groups.

id

output:

...

uid=680748(cbb48) gid=10001(lambrecht) groups=10001(lambrecht),10061(phys441)

...
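If the id output does not show the account you expect, you can also list the SLURM accounts (shares) associated with your user, assuming the standard sacctmgr client is available:

sacctmgr show associations user=<CaseID> format=Account,User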

More Info:

For requesting the proper resources, refer to Servers and Storage. For other available SLURM options, refer to SLURM Usage Overview. If you want to be notified by email, refer to Email Notification.