HPC Projects Registration
Pilot
What are HPC Projects
HPC projects make it possible to manage HPC cluster users and resources, and to provide detailed usage tracking associated with different schools, departments, or research groups.
Types of Projects
A PI may have projects for a range of topics, such as:
Study of Mice Brain Images
Study of Particle Physics
Course: Intro to Biology
Course: Applied Statistics
Each such project has its own owner (PI), managers, members, and, if approved, access to special cluster resources.
Management Portal
Link: HPC Projects management
Types of users:
Project owner (PI), who may create new projects
Members (users who can run Slurm jobs under this project)
Managers (users who can modify the project, assign members/managers, etc.)
Special resources approver
Usage inspector
HPC admin
Specifying Projects with Slurm
Any member of a project can run Slurm jobs for that project by specifying the corresponding account. Each project is assigned an account name, generally starting with "pr_" (for "project"), followed by a numeric designation and a function.
For example, an account name might specify general resource access for Project 3.
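A submission for this case might look like the following sketch; the account name pr_3_general is an assumed illustration of the "pr_" naming pattern, not a real account:

```bash
# "pr_3_general" is a hypothetical account name following the
# pr_<number>_<function> pattern described above
srun --account=pr_3_general --pty /bin/bash
```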
Accessing Special Cluster Resources
Some NYU schools/units have access to certain cluster resources, such as priority access to GPUs owned by the unit.
A project owner may request permission for a specific project to have access to such special resources. When a job is submitted with the corresponding account specification, it receives priority access to that resource. At the same time, the job has normal-priority access to all other cluster resources.
For example:
srun --account=pr_3_tandon_gpu
The above activates high-priority access to Tandon-owned GPU resources. If a project has access to multiple special resources, such as tandon_gpu and cds_gpu, only one of them receives high priority: whichever is specified in the Slurm job submission. A user must choose which special resource to prioritize for a specific job submission.
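The same account selection can be expressed in a batch script; the resource requests and program name below are illustrative assumptions, not prescribed values:

```bash
#!/bin/bash
#SBATCH --account=pr_3_tandon_gpu   # charge the job to this project's special-resource account
#SBATCH --gres=gpu:1                # request one GPU; priority applies to Tandon-owned GPUs
#SBATCH --time=01:00:00             # illustrative time limit

srun ./my_gpu_program               # "my_gpu_program" is a placeholder
```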
AMD GPUs
While Nvidia GPUs can be requested using the standard --gres=gpu Slurm option, AMD GPUs require a slightly different configuration.
One can add a constraint to specify that AMD GPUs are needed. The following requests any AMD node, similarly to how --gres=gpu on its own requests any Nvidia node:
--gres=gpu --constraint=amd
If you need a specific AMD GPU (MI50, MI100, MI250), that should be specified in the constraint as well. The example below requests only MI100 or MI250 AMD GPUs.
--gres=gpu --constraint="mi100|mi250"
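Putting the pieces together, a batch submission targeting a specific AMD GPU might look like this sketch; the account name and program name are hypothetical placeholders:

```bash
#!/bin/bash
#SBATCH --account=pr_3_general        # hypothetical project account (see naming pattern above)
#SBATCH --gres=gpu:1                  # request one GPU
#SBATCH --constraint="mi100|mi250"    # restrict the job to MI100 or MI250 AMD nodes

srun ./my_amd_program                 # placeholder program name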