Instructor: Peter F. Klemperer, pklemperer@umass.edu
Lectures: TBA, on Zoom
Course web page: Canvas is the Learning Management System for this course.
TAs: TBA
Office hours:
Dr. Klemperer: TBA, on Zoom; 1-on-1 by appointment
TAs: by appointment; also available before and after labs
Piazza:
Past Course syllabus:
In this course, students will learn the fundamentals behind large-scale systems in the context of data science. We will cover the issues involved in scaling parallelism up (to many processors) and out (to many nodes) in order to perform fast analyses on large datasets. These include locality and data representation, concurrency, distributed databases and systems, and performance analysis and understanding. We will explore the details of existing and emerging data science platforms, including MapReduce-Hadoop, Spark, and more. This course counts as a CS Elective for the CS Major. Undergraduate Prerequisites: COMPSCI 377 and COMPSCI 445. 3 credits.
See the reading list in the syllabus.