Deep learning systems have revolutionized one field after another, achieving unprecedented empirical performance. Yet their intricate structure has led most practitioners and researchers to regard them as black boxes, with little about their inner workings that can be understood. In this course, we will review experimental and theoretical work aiming to improve our understanding of modern deep learning systems.
Monday lecture: 16:10-18:00
Thursday lecture: 16:10-17:00
Wednesday office hour: 16:10-17:00
Zoom
[20%]: Attendance and participation
[5%]: Problem set 1
[35%]: 10 tiny PyTorch coding exercises, 3.5% each
[40%]: Final project on paper of your choosing
Introduction to sparsity
Sparse Coding
Convolutional Sparse Coding
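For a first taste of these topics, here is a minimal, illustrative sketch of sparse coding: given a signal x and a dictionary D, we seek a sparse code a minimizing 0.5*||x - Da||^2 + lam*||a||_1, which the iterative soft-thresholding algorithm (ISTA) solves. The dictionary size, sparsity level, and lam below are arbitrary illustrative choices, not course materials.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1: shrink each entry toward zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(x, D, lam, n_iters=200):
    """Sparse coding: solve min_a 0.5*||x - D a||^2 + lam*||a||_1 via ISTA."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part
    a = np.zeros(D.shape[1])
    for _ in range(n_iters):
        # Gradient step on the quadratic term, then soft-thresholding.
        a = soft_threshold(a - D.T @ (D @ a - x) / L, lam / L)
    return a

# Toy demo (all sizes and values are arbitrary): recover a 5-sparse code.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)  # normalize dictionary atoms
a_true = np.zeros(256)
a_true[rng.choice(256, size=5, replace=False)] = 1.0
x = D @ a_true
a_hat = ista(x, D, lam=0.05)
print("nonzeros in recovered code:", np.count_nonzero(np.abs(a_hat) > 1e-3))
```

Convolutional sparse coding replaces the dense dictionary with a bank of convolutional filters; the same proximal machinery applies.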
Preferably on a topic involving the theoretical or empirical investigation of deep learning.
Examples of papers can be found below. However, you are encouraged to pick a paper not included in the list.
You can consult me about your choice during office hours, via email, or through other communication channels.
In the final 4 lectures of the course, students will present their projects.
Students will have 5 minutes of presentation followed by 1 minute of questions.
Pairs will have 10 minutes of presentation followed by 2 minutes of questions.
The presentation should use slides (Keynote, Google Slides, PowerPoint, Beamer, or other similar tools).
The presentation should summarize the report.
Students will submit a two-page report:
1 page summarizing the paper
1 page proposing and implementing a novel experiment or proving a theoretical result.
You are encouraged to build your experiment on open-source implementations, if available.
Pairs will submit a report twice as long (four pages).
PDF format, 1-inch margins, 10pt font size, preferably typeset in LaTeX (see the minimal preamble sketch after this list).
The deadline is midnight on April 23rd.
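For reference, a minimal LaTeX preamble satisfying the format requirements above might look like the following; the geometry package settings are one way to obtain 1-inch margins, and any equivalent setup is fine.

```latex
\documentclass[10pt]{article}      % 10pt font size
\usepackage[margin=1in]{geometry}  % 1-inch margins on all sides
\usepackage{amsmath,amssymb}       % standard math support
\usepackage{graphicx}              % for experiment figures

\title{Final Project Report}
\author{Your Name}

\begin{document}
\maketitle
% Page 1: summary of the paper.
% Page 2: proposed experiment or theoretical result.
\end{document}
```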
Uniform Convergence May Be Unable to Explain Generalization in Deep Learning
Gradient Descent Provably Optimizes Over-Parameterized Neural Networks
On the Global Convergence of Gradient Descent for Over-Parameterized Models Using Optimal Transport
Insights on Representational Similarity in Neural Networks With Canonical Correlation
Towards Deep Learning Models Resistant to Adversarial Attacks