Instructor: Sumegha Garg (email: sumegha.garg@rutgers.edu)
Schedule: Tuesdays, 10:20 am-1:20 pm, Beck Hall 252 (Livingston Campus)
TA: Naina Chaturvedi (email: nc832@cs.rutgers.edu)
Office hours: TBD
Prerequisites: Undergraduate courses on algorithms, complexity theory, discrete mathematics and probability; mathematical maturity. Graduate courses on computability theory and algorithms (like CS 509 and 513) are strongly encouraged.
Course Overview
This is a research-oriented graduate course focused on learning theory under memory, communication, and information constraints. The course will begin with fundamental concepts, including PAC learning, online learning, and combinatorial dimensions. After the initial lectures, the course will transition to a seminar format: each week, students will read research papers, with one student presenting the key results to the class. A key learning outcome of the course is to develop the ability to read, understand, and present recent technical advances on the limits of learning under computational constraints.
Grading components
Attendance
Paper presentation and participation
3 homework assignments
Final project (presentations on Dec 9, 2025)
Weekly lectures and reading materials
Lecture 1 -- Overview, PAC learning and generalization for finite hypothesis classes. Handwritten notes: [pdf]
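For quick reference, the basic generalization bound from Lecture 1 for a finite hypothesis class in the realizable setting (the standard textbook statement; see the notes for the precise version):
\[
m \;\ge\; \frac{1}{\epsilon}\Big(\ln|\mathcal{H}| + \ln\frac{1}{\delta}\Big)
\;\;\Longrightarrow\;\;
\Pr\big[\operatorname{err}_{\mathcal{D}}(h_{\mathrm{ERM}}) > \epsilon\big] \;\le\; \delta,
\]
that is, with m i.i.d. samples, every hypothesis consistent with the training set has true error at most epsilon, except with probability delta.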
Lecture 2 -- Uniform convergence, VC dimension, PAC learning for bounded VC classes. Handwritten notes: [pdf]
Lecture 3 -- Uniform convergence of bounded VC classes, agnostic learning, computational complexity of learning. Handwritten notes: [pdf]
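As a companion to Lectures 2-3, the standard sample-complexity bounds in terms of the VC dimension d (textbook statements with constants omitted):
\[
m_{\mathrm{realizable}}(\epsilon,\delta) = O\!\left(\frac{d\ln(1/\epsilon) + \ln(1/\delta)}{\epsilon}\right),
\qquad
m_{\mathrm{agnostic}}(\epsilon,\delta) = O\!\left(\frac{d + \ln(1/\delta)}{\epsilon^{2}}\right).
\]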
Lecture 4 -- Representation-dependent hardness of PAC learning 3-term DNFs and agnostically learning halfspaces. Handwritten notes: [pdf]
Lecture 5 -- Perceptron algorithm, convex optimization and gradient descent. Handwritten notes: [pdf]
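To make the Lecture 5 material concrete, here is a minimal perceptron sketch (an illustration in Python; the toy data and names are mine, not from the notes). Recall the classical mistake bound: at most (R/gamma)^2 updates when the data are separable with margin gamma inside a ball of radius R.

import numpy as np

def perceptron(X, y, max_passes=100):
    # X: (n, d) array of examples, y: labels in {-1, +1}.
    # Returns w with y_i * <w, x_i> > 0 for all i whenever the data
    # are linearly separable through the origin.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(max_passes):
        mistakes = 0
        for i in range(n):
            if y[i] * np.dot(w, X[i]) <= 0:  # mistake: additive update
                w += y[i] * X[i]
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: done
            break
    return w

# Toy separable data: label is the sign of the first coordinate.
X = np.array([[2.0, 1.0], [1.0, -1.0], [-1.5, 0.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = perceptron(X, y)
print(w, np.sign(X @ w))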
Lecture 6 -- Convergence analysis of gradient descent and variants. Handwritten notes: [pdf].
Paper presentations by Nilava on Agnostically Learning Halfspaces and Tanvi on SQ Lower Bounds for Agnostically Learning Halfspaces.
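A representative rate from the convergence analysis in Lecture 6 (the standard statement for smooth convex objectives; see the notes for the exact assumptions):
\[
f \text{ convex and } L\text{-smooth}, \quad \eta = \tfrac{1}{L}, \quad x_{t+1} = x_t - \eta\,\nabla f(x_t)
\;\;\Longrightarrow\;\;
f(x_T) - f(x^{*}) \;\le\; \frac{L\,\lVert x_0 - x^{*}\rVert^{2}}{2T}.
\]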
Lecture 7 -- Neural networks and their expressiveness. Handwritten notes: [pdf].
Paper presentations by Homit on Rethinking Generalization for Deep Learning, Wujiang on Implicit Regularization in Deep Learning and Yuhong on Nonvacuous PAC-Bayes bounds for Deep Learning.
Other resources on deep learning theory: courses by Ankur Moitra (https://people.csail.mit.edu/moitra/408c.html) and Matus Telgarsky (https://mjt.cs.illinois.edu/dlt/)
Lecture 8 -- Theme: Distributed Learning.
Paper presentations by Harsha on Communication-Efficient Learning of Deep Networks from Decentralized Data, Jingyuan on Algorithms and Hardness Results for Parallel Large Margin Learning and Garrett on Limits of Distributed Algorithms for Learning.
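A schematic of the FedAvg protocol from the McMahan et al. paper presented this week, on a synthetic least-squares problem (a simplified sketch; the helper names and local objective are illustrative assumptions, not the paper's setup).

import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    # One client's local training: a few gradient steps on its own
    # least-squares loss, starting from the current global model.
    w = w.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

def fedavg(clients, rounds=20, d=3):
    # Server loop: broadcast the global model, collect each client's
    # locally updated model, and average weighted by local data size.
    w = np.zeros(d)
    for _ in range(rounds):
        updates = [local_update(w, X, y) for X, y in clients]
        sizes = np.array([len(y) for _, y in clients], dtype=float)
        w = np.average(updates, axis=0, weights=sizes)
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 3))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=50)))
print(fedavg(clients))  # should be close to w_true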
Lecture 9 -- Theme: Practical approaches to memory-constrained learning.
Paper presentations by Adish on AI and Memory Wall, Rohit on MCUNetV2: Memory-Efficient Patch-based Inference, and Atin and Can on the Flash Attention algorithm.
Lecture 10 -- Brief introduction to streaming algorithms. Handwritten notes: [pdf].
Paper presentations by Jiawei on Memory, Communication, and Statistical Queries, and by Mursalin and Joe on Raz's breakthrough result on memory-sample tradeoffs for parity learning.
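To make the streaming model from Lecture 10 concrete, a standard low-memory example (the Misra-Gries frequent-items summary, which keeps only k-1 counters regardless of stream length; an illustration of the model, not taken from the presented papers).

def misra_gries(stream, k):
    # Keeps at most k-1 (item, counter) pairs. Any item occurring more
    # than n/k times in a stream of length n is guaranteed to survive.
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            # Decrement every counter; drop the ones that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = "a b a c a b d a e a b".split()
print(misra_gries(stream, k=3))  # 'a' (frequency 5 > 11/3) survives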
Lecture 11 -- Introduction to online learning. Handwritten notes: [pdf].
Paper presentations by Atreyi on Strong Memory Lower Bounds for Learning Natural Models and Mingyu and Xi on Memory Bounds for the Experts Problem.
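As background for the experts problem discussed this week, a minimal multiplicative-weights sketch (the standard full-information algorithm with made-up losses; regret against the best expert is O(sqrt(T log n)) for a suitable step size eta).

import numpy as np

def multiplicative_weights(losses, eta=0.05):
    # losses: (T, n) array of per-round losses in [0, 1] for n experts.
    # Returns the algorithm's expected cumulative loss.
    T, n = losses.shape
    w = np.ones(n)
    total = 0.0
    for t in range(T):
        p = w / w.sum()                 # play a distribution over experts
        total += p @ losses[t]          # expected loss this round
        w *= np.exp(-eta * losses[t])   # downweight experts that lost
    return total

rng = np.random.default_rng(1)
losses = rng.uniform(size=(1000, 5))
losses[:, 2] *= 0.5                     # expert 2 is best on average
print(multiplicative_weights(losses), losses.sum(axis=0).min())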
Lecture 12 -- Theme: Private Learning.
Paper presentations by Artem on Robust De-anonymization of Large Sparse Datasets, Jiakang on Extracting Training Data from Large Language Models and Santosh on Deep Learning with Differential Privacy.
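A small companion to the differential-privacy presentation: the standard Gaussian mechanism for releasing a mean with (epsilon, delta)-differential privacy (the clipping range and dataset below are illustrative assumptions).

import numpy as np

def private_mean(x, epsilon, delta, clip=1.0, rng=None):
    # Clip each entry to [-clip, clip], so changing one record moves the
    # mean by at most 2*clip/n (the sensitivity), then add Gaussian noise
    # calibrated as sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon.
    rng = rng or np.random.default_rng()
    x = np.clip(np.asarray(x, dtype=float), -clip, clip)
    sensitivity = 2 * clip / len(x)
    sigma = np.sqrt(2 * np.log(1.25 / delta)) * sensitivity / epsilon
    return x.mean() + rng.normal(scale=sigma)

data = np.random.default_rng(0).normal(loc=0.3, scale=0.5, size=10_000)
print(private_mean(data, epsilon=0.5, delta=1e-5))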
Additional resources
Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David. PDF here.
Other related courses
Foundations of Machine Learning and Data Science (by Nina Balcan and Avrim Blum at CMU)
Theory of Machine Learning (by Vatsal Sharan at USC)
Foundations of Modern Machine Learning (by Nika Haghtalab at Cornell)