Resources for Classes

We support educational activities at NYU in several ways.

If you are an instructor or TA for a course, please review the options below. If you think we could support your teaching in other ways, let us know (Contact us).

Please note that we do not have a separate cluster dedicated to courses.

General HPC (Greene)

Classes may use the Greene cluster if this is absolutely necessary, for example because of large datasets or heavy computations.

Notes:

One can find more information here

Special pre-allocation of resources

In some rare cases it is possible to get a special allocation of nodes for a specific class. However, this is very uncommon and requires strong justification, because it prevents other users from using the hardware.

Sometimes resources can be allocated in the cloud instead, using a cloud-bursting approach, to reduce the impact on other users' work. Please contact us for details if you believe your class needs this kind of resource. Additional costs may be associated with this approach.

Hadoop (Dataproc): Big Data / HDFS / Spark

If you are an instructor or student in a Big Data class that requires learning Hadoop, Dataproc is a potential option. You can teach and learn the ecosystem of Hadoop/Spark technologies used in many companies working with large data.

For work on large datasets, traditional HPC relies on specialized shared file systems (such as Lustre and BeeGFS) and a high-speed network. In contrast, many companies in industry rely on Hadoop/HDFS. Hadoop provides a good model of horizontal scaling: when data input/output requirements grow (for example, the number of users reading a website or sending queries), additional nodes can be added to allow for faster reads and writes. Hadoop's map-reduce approach allows one to write code that brings computations to the same nodes where the data is stored, so the relatively slow inter-node communication has less impact.
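The map-reduce idea described above can be illustrated with a minimal sketch in plain Python. This is not the Hadoop or Dataproc API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample data are hypothetical, and each inner list only stands in for a block of data stored on one node. The point is that the map step runs independently per block, and only the small intermediate (word, count) pairs need to be moved between nodes.

```python
from collections import defaultdict


def map_phase(block):
    """Emit a (word, 1) pair for every word in one block of lines.

    In a real cluster this step would run on the node that stores the block.
    """
    return [(word.lower(), 1) for line in block for word in line.split()]


def shuffle(pairs_per_block):
    """Group the intermediate values by key across all blocks."""
    grouped = defaultdict(list)
    for pairs in pairs_per_block:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the grouped values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    """Run map, shuffle, and reduce over a list of blocks (hypothetical nodes)."""
    mapped = [map_phase(block) for block in blocks]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Two blocks, standing in for data held on two different nodes.
    blocks = [["big data big cluster"], ["data moves less"]]
    print(word_count(blocks))
```

In an actual Hadoop or Spark job the framework handles the distribution of the map step and the shuffle; the sketch only shows how the work is split into phases.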

Notes

Dedicated course coding environment (JupyterHub on GCP)

A dedicated environment provides some advantages compared to students working on their laptops or on the HPC cluster. Advantages include high availability, no HPC queue, and simple management of the environment. For more information (including who is paying), see JupyterHub at ResearchCloud. We support classes of various sizes, from very small classes to classes of hundreds of students.

Google Cloud (GCP)

List of classes using our services

Other resources you may find useful

Comparison table of popular options