Deep learning systems have revolutionized one field after another, leading to unprecedented empirical performance. Yet their intricate structure has led most practitioners and researchers to regard them as black boxes, with little that can be understood. In this course, we will review experimental and theoretical works that aim to improve our understanding of modern deep learning systems.
Tuesday 16:00-17:00, UC87
Wednesday 15:00-16:00, MS2172
Friday 13:10-14:00, WI524
Office hours: Monday 15:10-16:00, BA6186
[20%]: Attendance and participation
[40%]: Tiny PyTorch coding exercises
[40%]: Final project on paper of your choosing
What is this course about?
Brief Introduction to Deep Learning
Brief Introduction to Optimization
Mysteries in Deep Learning
Information Bottleneck
Criticism of Information Bottleneck
Rethinking Generalization
Max-Margin Implicit Bias of Gradient Descent
Neural Collapse
Neural Networks as Gaussian Processes
Neural Tangent Kernel (NTK)
Measuring Spectrum of Deep Net Hessian at Scale
Introduction to Random Matrix Theory
Predicting Generalization Error Through NTK
Lazy Versus Active Training
Mean Field Analysis of Two-Layer Neural Networks
Max-Margin Implicit Bias of Two-Layer Neural Networks
Choose a paper
The paper should be a theoretical or empirical investigation of deep learning.
You can consult me about your choice during office hours, via email, or through other communication channels.
Submit a two-page report
PDF format, 1-inch margins, 10pt font, preferably typeset in LaTeX.
1 page summarizing the paper.
1 page proving a novel theoretical result or proposing and implementing a novel experiment. You are encouraged to build your experiment on open-source implementations, if available.
The deadline is the last day of the semester.
Present the report in the final lectures
Using slides (Keynote, Google Slides, PowerPoint, Beamer, or other similar tools).
A 5-minute presentation of the paper summary and your novel contribution, followed by 1 minute of questions.
Pairs:
A report twice as long.
10 minutes of presentation followed by 2 minutes of questions.