COMPSCI 532 - Systems for Data Science
Instructor: Peter F. Klemperer, pklemperer@umass.edu
Lectures: TBA, on Zoom
Course web page: Moodle is the Learning Management System for this course.
TAs: TBA
Office hours:
Dr. Klemperer: TBA, on Zoom; 1-on-1 by appointment
TAs: by appointment; also available before and after labs
Piazza:
Past Course syllabus:
Course Overview
In this course, students will learn the fundamentals behind large-scale systems in the context of data science. We will cover the issues involved in scaling parallelism up (to many processors) and out (to many nodes) in order to perform fast analyses on large datasets. These include locality and data representation, concurrency, distributed databases and systems, and performance analysis and understanding. We will explore the details of existing and emerging data science platforms, including MapReduce-Hadoop, Spark, and more. This course counts as a CS Elective for the CS Major. Undergraduate Prerequisites: COMPSCI 377 and COMPSCI 445. 3 credits.
Required Text
See reading list in the syllabus.