There are several ways to access the cluster. We show only one example below; please refer to the Accessing HPC page to read about other options.
Simply open a terminal (Linux, Mac) or the Command Prompt (Windows 10) and enter the commands:
ssh <NetID>@gw.hpc.nyu.edu   # you can skip this step if you are on the NYU network or using the NYU VPN

NOTE: When you are asked to enter your password, just type it (the letters will not be displayed) and then hit "Enter".
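The comment above implies a second hop from the gateway to the cluster itself. Assuming the standard Greene login host name, that step would look like:

ssh <NetID>@greene.hpc.nyu.edu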
There are several types of jobs that can be run on the Greene cluster. The traditional type of HPC workload is the "batch job", in which a job is sent to the system to execute a function or program without further user input or interaction.
Users can submit batch jobs to the SLURM scheduler (the software that manages the job queue on Greene).
For other job types please read here.
Using a text editor such as vim, nano, or emacs, create a Python script. Below is an example called 'hello-world.py':
import os
print("Hello from the compute node", os.uname()[1])

The script loads the os library in order to print the name of the server that the script runs on.
To run our Python script we need to submit a batch job so that SLURM allocates the necessary compute resources. SLURM expects a file in the following format in order to execute the job. The file contains both directives for SLURM to interpret and the commands it should execute. Below is a simple example of a batch job to run the Python script we just created; the file is named "hello-python.sbatch":
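The contents of 'hello-python.sbatch' are not reproduced here. A minimal sketch consistent with the rest of this walkthrough (three nodes, two tasks per node, output written to 'python-hello-world.out'; the Python module name and the time/memory limits are assumptions) could be:

#!/bin/bash
#SBATCH --nodes=3                     # request three nodes
#SBATCH --ntasks-per-node=2           # two tasks per node
#SBATCH --cpus-per-task=1
#SBATCH --time=00:05:00               # assumed walltime limit
#SBATCH --mem=2GB                     # assumed memory per node
#SBATCH --job-name=python-hello-world
#SBATCH --output=python-hello-world.out

module purge
module load python/intel/3.8.6        # assumed module name; check "module avail python"

python ./hello-world.py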
You then submit the job with the following command:
$ sbatch hello-python.sbatch

The command will result in the job queuing as it waits for resources to become available (how long this takes varies with the number of other jobs running on the cluster). You can see the status of your jobs with the following command:
$ squeue -u $USER

There are many more ways to allocate resources in SLURM. Please read here for more details.
Once the job has completed, you can read the output of your job in the python-hello-world.out file. This is where logs regarding the execution of your job can be found, including errors or system messages. You can print the contents to the screen from the directory containing the output file with the following command:
$ cat python-hello-world.out

Python natively runs on only a single CPU core. This is why the output contains only one printed statement, even though we requested several nodes and 2 tasks per node. Your output may look something like the following:
Hello from the compute node cs240.nyu.cluster

We can make our script use all the resources allocated by SLURM. We can accomplish this by parallelizing the batch script, which we will do in the next step.
Having allocated resources on multiple nodes (Step 1), we can now run scripts on multiple nodes.
One of the most popular software tools for running code on multiple nodes/cores on HPC systems is "Open MPI".
Here is a simple example of using Open MPI with the job we ran before.
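The batch file for this step is not shown here. A sketch of what it might look like (the Open MPI and Python module names are assumptions; the resource requests match the earlier example):

#!/bin/bash
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:05:00
#SBATCH --mem=2GB
#SBATCH --job-name=python-hello-world-mpi
#SBATCH --output=python-hello-world.out

module purge
module load openmpi/intel/4.0.5       # assumed module name
module load python/intel/3.8.6        # assumed module name

mpirun python ./hello-world.py        # launches one copy of the script per task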
The mpirun command will execute our hello-world.py script using each requested node/core. You can see it in the output:
Hello from the compute node cs240.nyu.cluster

Using mpirun, the batch job ran the hello-world.py script on three nodes, on two separate CPU cores per node, so you should see one such line for each task.
Now you can modify your program so that it can leverage the available resources.
In Python, for example, you can use the mpi4py package.
First, install the package by executing the following in a terminal:
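The install command itself is not shown here. Assuming the pip-based route and the module names used earlier (both assumptions), it could look like:

module load python/intel/3.8.6        # assumed module name
module load openmpi/intel/4.0.5       # assumed module name; mpi4py needs an MPI library to build against
pip install --user mpi4py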
Create a new Python file 'hello-world_parallel.py' (more examples can be found here):
import os
from mpi4py import MPI

And modify the batch file so that it runs the new script (see the sketch below).
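The listing above appears truncated after the imports. A minimal completion (the rank-reporting print statement is an assumption, not the original script) could be:

import os
from mpi4py import MPI

comm = MPI.COMM_WORLD

# Each MPI task prints its rank and the hostname of the node it runs on
print("Hello from rank", comm.Get_rank(), "of", comm.Get_size(),
      "on compute node", os.uname()[1])

In the batch file from the previous step, the only change needed is to point mpirun at the new script:

mpirun python ./hello-world_parallel.py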
To request one GPU, use an SBATCH directive in your job script:
#SBATCH --gres=gpu:1

We recommend using Singularity images containing pre-installed software for GPU jobs. You can use an overlay to install additional packages, as described here. Here is an example with an overlay image containing Miniconda and PyTorch, 'overlay-10GB-400K-tensorflow-pytorch.ext3'.
srun --pty -c 2 --mem=10GB --gres=gpu:rtx8000:1 /bin/bash   # request an RTX8000 GPU node

Now we can run the SLURM job script 'run-test.SBATCH', which will start our Singularity image and call the 'torch-test.py' script.
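Neither 'run-test.SBATCH' nor 'torch-test.py' is reproduced here. A sketch of what they might contain follows; the overlay location, the base .sif image path, and the /ext3/env.sh activation script are assumptions based on the overlay workflow linked earlier.

torch-test.py:

import torch

# Report whether the GPU is visible to PyTorch inside the container
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))

run-test.SBATCH:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=10GB
#SBATCH --gres=gpu:rtx8000:1
#SBATCH --time=00:10:00
#SBATCH --job-name=torch-test
#SBATCH --output=torch-test.out

module purge

singularity exec --nv \
    --overlay /scratch/$USER/overlay-10GB-400K-tensorflow-pytorch.ext3:ro \  # assumed overlay location
    /scratch/work/public/singularity/cuda11.0-cudnn8-devel-ubuntu18.04.sif \  # assumed base image
    /bin/bash -c "source /ext3/env.sh; python ./torch-test.py"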
Note: If you started an interactive job with singularity above, exit it first.
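Then submit the job script in the same way as before:

sbatch run-test.SBATCH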
There are other kinds of jobs that can be run on Greene, such as interactive jobs and job arrays. The links below give further information on the usage of SLURM.
Examples available on Greene
/scratch/work/public/examples

Open OnDemand is a tool that allows users to launch Graphical User Interface (GUI) based applications without modifying their HPC environment.
Before you can use Open OnDemand, you need to log in to the cluster at least once using a terminal (this creates your home directory); otherwise OOD won't connect.
You can log into the Open OnDemand interface at https://ood.hpc.nyu.edu.
Once logged in, open the Interactive Apps menu, select the desired application, and submit the job with the required resources and options.
Jupyter Notebook with Conda environments and Singularity: instructions
Several IDEs/apps are available and can be accessed using the "Interactive Apps" menu item.
You can view software available on Greene as well as instructions on installing new software at the Greene Software page.