Job Scheduling
Important Tips:
Avoid Running Jobs on the Login Nodes
Please DO NOT use the login nodes (e.g. hpc1 or hpc2) to run your jobs. Always submit jobs with the "sbatch" command. For interactive work -- running graphics applications (e.g. MATLAB), scripts, and other STDIO -- use the command "srun --x11 --pty bash", which assigns you a compute node. Jobs running on a login node will be killed. If you have already started a job on a login node, cancel it with "kill <PID>"; you can find the PID by running "top" on the login node. To kill all of your processes, use:
kill -9 `ps -ef | grep <caseID> | grep -v grep | awk '{print $2}'`
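For reference, a minimal batch script might look like the following sketch (the job name, time limit, module, and command are placeholders; substitute your own):

#!/bin/bash
#SBATCH --job-name=my_job        # placeholder job name
#SBATCH --time=01:00:00          # requested walltime (HH:MM:SS)
#SBATCH --ntasks=1               # number of tasks
module load matlab               # load whatever software your job needs (MATLAB shown only as an example)
matlab -nodisplay -r "run('myscript.m'); exit"   # placeholder command

Save it as, for example, myjob.slurm and submit it from the login node with "sbatch myjob.slurm".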
Job Locations
Jobs should normally be run in the scratch space ($PFSDIR) rather than in your home directory. The home directory is limited by the group storage quota, so jobs can stop prematurely when they run out of space. Jobs with heavy Input/Output can also degrade access to your home directory.
Each group has storage quota limits. For more information about storage limits and disk usage, please refer to Storage & Quota.
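As an illustration (the program and file names below are placeholders), a job script can stage its data through $PFSDIR and copy the results back before the job ends:

#!/bin/bash
#SBATCH --time=02:00:00          # requested walltime
#SBATCH --ntasks=1
cp input.dat $PFSDIR             # copy input from the submit directory to scratch
cd $PFSDIR                       # run the job inside the scratch space
./my_program input.dat > output.log    # placeholder program and output file
cp output.log $SLURM_SUBMIT_DIR  # copy results back to the submit directory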
On GPU nodes, we provide fast SSD drives for scratch space. On the Rider cluster, use $TMPDIR on the gpu2v100, gpu4v100, and gpu2080 nodes. On the Markov cluster, use /mnt/fs1 as the scratch space.
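For example (assuming a gpu partition and one GPU per job; the data and application names are placeholders), a GPU job on Rider can stage its data onto the node-local SSD via $TMPDIR:

#SBATCH -p gpu                   # gpu partition (see Node Partitions below)
#SBATCH -C gpu2v100              # one of the SSD-equipped node types listed above
#SBATCH --gres=gpu:1             # request one GPU
cp data.bin $TMPDIR              # stage input onto the node-local SSD scratch
cd $TMPDIR
./gpu_app data.bin > results.out # placeholder application
cp results.out $SLURM_SUBMIT_DIR # copy results back before the job ends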
If your job creates a large number of output files, or otherwise ends up with a huge number of files in a single directory, please follow the Panasas Storage Guideline for Huge Directory.
Node Partitions
We provide batch, smp, and gpu queues (node partitions).
Additional node features (requested with "-C") can be included in the job request to further narrow down which nodes are selected.
See the HPC Resource View to find which nodes belong to each partition and which features they offer.
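As a sketch (the batch feature name is a placeholder; check the HPC Resource View for the actual feature names), partitions and features can be requested either on the sbatch command line or inside the script:

sbatch -p batch -C <feature> myjob.slurm   # command-line form; <feature> is a placeholder

#SBATCH -p gpu                   # or set the partition in the script
#SBATCH -C gpu2080               # gpu2080 is one of the GPU node features mentioned above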