IFT 6135 - H2022 - Lectures

IFT 6135 - Representation Learning

14 – Meta-Learning - Hugo Larochelle (11/04/2022)

FLIPPED CLASS -- PLEASE VIEW THE VIDEO BEFORE CLASS

A lot of the recent progress on many AI tasks was enabled in part by the availability of large quantities of labeled data. Yet, humans are able to learn concepts from as little as a handful of examples. Meta-learning is a very promising framework for addressing the problem of generalizing from small amounts of data, known as few-shot learning. In meta-learning, our model is itself a learning algorithm: it takes as input a training set and outputs a classifier. For few-shot learning, it is (meta-)trained directly to produce classifiers with good generalization performance for problems with very little labeled data. In this talk, Hugo presents an overview of the recent research that has made exciting progress on this topic (including his own) and, if time permits, will discuss the challenges as well as research opportunities that remain.
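
The "training set in, classifier out" interface can be sketched in a few lines. This is an illustration only, assuming a nearest-centroid rule over raw feature vectors; the names `learn` and `support` are hypothetical and are not taken from the lecture. A meta-learner would additionally train the components of `learn` (for instance a feature embedding) across many such small tasks.

```python
from collections import defaultdict
from math import dist


def learn(support_set):
    """Map a small labeled training set to a classifier (a function).

    support_set: list of (feature_vector, label) pairs.
    Assumption: a nearest-centroid rule stands in for the learned algorithm.
    """
    grouped = defaultdict(list)
    for features, label in support_set:
        grouped[label].append(features)
    # One centroid per class: the coordinate-wise mean of its examples.
    centroids = {
        label: [sum(column) / len(points) for column in zip(*points)]
        for label, points in grouped.items()
    }

    def classify(query):
        # Predict the label of the closest centroid (Euclidean distance).
        return min(centroids, key=lambda label: dist(query, centroids[label]))

    return classify


# A 2-way, 2-shot task: four labeled examples in total.
support = [([0.0, 0.1], "cat"), ([0.2, 0.0], "cat"),
           ([1.0, 0.9], "dog"), ([0.8, 1.1], "dog")]
classifier = learn(support)
prediction = classifier([0.9, 1.0])
```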

Video: Hugo's lecture (from the course two years ago) and the in-class discussion

Slides: Meta-Learning slides

13 – Self-Supervised Learning (30/03/2022-06/04/2022)

In this lecture, we will discuss self-supervised learning. We will discuss how to learn representations beyond the supervised pre-training paradigm, and we will see how effective pretext tasks can be designed and how to train with contrastive objectives.
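
As a rough illustration of what a contrastive objective computes, here is a minimal sketch of an InfoNCE-style loss for a single anchor, one common form of contrastive objective. The function names, the cosine similarity, and the temperature value are assumptions made for this example, not details from the slides.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Negative log-probability of picking the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, negative) / temperature for negative in negatives]
    # Numerically stable log-sum-exp over all candidates.
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(value - top) for value in logits))
    # The positive sits at index 0.
    return log_norm - logits[0]


anchor = [1.0, 0.0]
# Two views that agree: the positive is close to the anchor.
loss_aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# A "positive" that is farther from the anchor than one of the negatives.
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

The loss is small when the anchor is more similar to its positive than to any negative, and large otherwise.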

Slides:

SSL - pretext tasks slides
SSL - contrastive methods (plus bonus material at the end)

Video of SSL lecture I. (30/03/2022)
Video of SSL lecture II. (04/04/2022)
Video of SSL lecture III. (06/04/2022)

Reference:

Doersch, Carl, Abhinav Gupta, and Alexei A. Efros. "Unsupervised visual representation learning by context prediction." CVPR (2015).
Gidaris, Spyros, Praveer Singh, and Nikos Komodakis. "Unsupervised representation learning by predicting image rotations." ICLR (2018).
Wu, Zhirong, et al. "Unsupervised feature learning via non-parametric instance discrimination." CVPR (2018).
He, Kaiming, et al. "Momentum contrast for unsupervised visual representation learning." CVPR (2020).
Chen, Ting, et al. "Big self-supervised models are strong semi-supervised learners." NeurIPS (2020).
Grill, Jean-Bastien, et al. "Bootstrap your own latent: A new approach to self-supervised learning." NeurIPS (2020).

12 – GANs (23/03/2022-30/03/2022)

In this lecture, we will discuss Generative Adversarial Networks (GANs). GANs are a recent and very popular generative model paradigm. We will discuss the GAN formalism, some theory and practical considerations.

Slides:

GAN slides

Video of part I of GANs lecture. (23/03/2022)
Video of part II of GANs lecture. (28/03/2022)
Video of part III of GANs lecture. (30/03/2022, just finishing up)

Reference: (* = you are responsible for this material)

*Section 20.10.4 of the Deep Learning textbook.
*Generative Adversarial Networks by Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (NIPS 2014).
*f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization by Sebastian Nowozin, Botond Cseke and Ryota Tomioka (NIPS 2016).
NIPS 2016 Tutorial: Generative Adversarial Networks by Ian Goodfellow, arXiv:1701.00160v1, 2016
Adversarially Learned Inference by Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky and Aaron Courville (ICLR 2017).
Many other references in the slides.

11 – Variational Autoencoders (16/03/2022-21/03/2022)

In this lecture, we will discuss a family of latent variable models known as Variational Autoencoders (VAEs). We’ll see how a deep latent Gaussian model can be seen as an autoencoder via amortized variational inference, and how such an autoencoder can be used as a generative model. At the end, we’ll take a look at variants of the VAE and different ways to improve inference.
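
To make the autoencoder view concrete, here is a minimal one-dimensional sketch of a single-sample ELBO estimate. Everything in it is an assumption chosen for illustration: the toy model is z ~ N(0, 1) and x | z ~ N(z, 1), and the hypothetical `encode` function returns the exact posterior N(x/2, 1/2) rather than the output of a trained network.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) )."""
    return 0.5 * (mu * mu + math.exp(log_var) - 1.0 - log_var)


def encode(x):
    # Hypothetical "encoder": for this toy model the exact posterior
    # over z given x is N(x / 2, 1 / 2). Returns (mean, log-variance).
    return 0.5 * x, math.log(0.5)


def log_likelihood(x, z):
    # Toy "decoder": x | z ~ N(z, 1).
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - z) ** 2


def elbo_estimate(x, rng):
    """Single-sample estimate: reconstruction term minus KL term."""
    mu, log_var = encode(x)
    z = reparameterize(mu, log_var, rng)
    return log_likelihood(x, z) - kl_to_standard_normal(mu, log_var)


rng = random.Random(0)
estimate = elbo_estimate(1.0, rng)
```

Because the encoder here matches the true posterior, the ELBO averaged over many samples equals log p(x) under N(0, 2); with an imperfect encoder it would be a strict lower bound.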

Slides:

Variational Autoencoders

Video of lecture I of Variational Autoencoders (16/03/2022).
Video of lecture II of Variational Autoencoders (21/03/2022).

Reference: (* = you are responsible for this material)

*Section 20.10.3 of the Deep Learning textbook.
*Chapter 2 of An Introduction to Variational Autoencoders by Kingma and Welling
Inference Suboptimality in Variational Autoencoders by Chris Cremer (ICML 2018)
Importance Weighted Autoencoders by Yuri Burda (ICLR 2016)
Variational Inference, lecture notes by David Blei, Sections 1-6.
Blog post Variational Autoencoder Explained by Goker Erdogan
Blog post Families of Generative Models by Andre Cianflone

10 – Normalizing Flows (14/03/2022)

In this lecture, we will have a crash course on Normalizing Flows, and see how they can be used as a generative model by inverting the transformation of the data distribution into a prior distribution.
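
The idea of mapping data to a prior and inverting that map to generate samples can be sketched with the simplest possible flow. This is a one-dimensional affine example under stated assumptions: the class name `AffineFlow` and its parameters are hypothetical, and the prior is taken to be a standard normal. The density follows from the change-of-variables formula log p(x) = log p_Z(f(x)) + log |df/dx|.

```python
import math
import random


def standard_normal_logpdf(z):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z


class AffineFlow:
    """Invertible map z = (x - shift) / scale from data space to the prior."""

    def __init__(self, shift, scale):
        if scale <= 0:
            raise ValueError("scale must be positive")
        self.shift = shift
        self.scale = scale

    def forward(self, x):
        """Data -> prior space, together with log |dz/dx|."""
        z = (x - self.shift) / self.scale
        log_det = -math.log(self.scale)
        return z, log_det

    def inverse(self, z):
        """Prior -> data space (used for sampling)."""
        return self.shift + self.scale * z

    def log_prob(self, x):
        """Exact log-density of x via the change-of-variables formula."""
        z, log_det = self.forward(x)
        return standard_normal_logpdf(z) + log_det

    def sample(self, rng):
        return self.inverse(rng.gauss(0.0, 1.0))


flow = AffineFlow(shift=2.0, scale=3.0)
log_density = flow.log_prob(5.0)
sample = flow.sample(random.Random(0))
```

With these parameters the flow represents a Gaussian with mean 2 and standard deviation 3; richer flows compose many such invertible layers with learned parameters.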

Slides: Normalizing Flows

Video of Normalizing Flows lecture.

Reference: (* = you are responsible for this material)

*Section 20.10.2 of the Deep Learning textbook.
*Chapters 1-2 (for the core idea) and Chapter 6 (for applications) of Normalizing Flows for Probabilistic Modeling and Inference by George Papamakarios and friends.
See the blog posts by Eric Jang: part 1, and part 2

09 – Autoencoders and Autoregressive Generative Models (07/03/2022-09/03/2022)

In this lecture we will take a closer look at a form of neural network known as an Autoencoder. We will also begin our look at generative models with Autoregressive Models.

Slides:

Video of lecture 1 (07/03/2022).

Video of lecture 2 (09/03/2022).

Reference: (* = you are responsible for this material)

*Chapters 13-14 of the Deep Learning textbook.
*Sections 20.10.5-20.10.10 of the Deep Learning textbook.
The Neural Autoregressive Distribution Estimator by Hugo Larochelle and Iain Murray (AISTATS 2011) -- this is just a suggestion, we don't cover NADE in class.
MADE: Masked Autoencoder for Distribution Estimation by Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle (ICML 2015).
Pixel Recurrent Neural Networks by Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu (ICML 2016).
*Conditional Image Generation with PixelCNN Decoders by Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu (NIPS 2016).
*Medium Post by Jessica Dafflon on PixelCNN's blindspot and how to fix it.

08 – Regularization (21/02/2022-23/02/2022)

In this lecture, we will have a rather detailed discussion of regularization methods and their interpretation.

Slides:

Regularization

Video of part I of this lecture (21/02/2022)

Video of part II of this lecture (23/02/2022) - I lost connection early on, but it should all be there.

Reference: (* = you are responsible for this material)

*Chapter 7 of the Deep Learning textbook.
Understanding deep learning requires rethinking generalization (ICLR 2017) by Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

07 – Self-Attention and Transformer (14/02/2022-16/02/2022)

In this talk, Arian Hosseini will look at self-attention and the transformer model. We will see how they work, examine them in depth, review analyses and performance results, and look at their applications, mainly in natural language processing. We will see how some language models based on the transformer architecture have surpassed human performance on some language understanding tasks, and we will also discuss their shortcomings.
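
For readers who want a concrete picture of self-attention before the lecture, here is a minimal sketch of scaled dot-product attention on plain Python lists. It is a simplification under stated assumptions: a single head, and queries, keys, and values taken directly from the input sequence, whereas a transformer would first apply learned linear projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector and one weight vector per query."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        output = [sum(w * v[j] for w, v in zip(weights, values))
                  for j in range(len(values[0]))]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Self-attention: the same sequence plays all three roles.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(sequence, sequence, sequence)
```

Each row of `weights` is a probability distribution over positions in the sequence, so every output is a convex combination of the value vectors.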

Slides:

Self-Attention and Transformer by A. Hosseini.

Video of lecture 1, part 1 (14/02/2022), chat (video ends abruptly due to technical issues)

Video of lecture 1, part 2 (14/02/2022)

Video of lecture 2 (16/02/2022)

Reference:

06 – Optimization and Normalization Methods (09/02/2022)

In this lecture, we will discuss popular and practical first-order optimization methods. We will not discuss second-order methods in class, but I do provide slides on some of them and their interpretation.
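
As one example of a first-order method, here is a minimal sketch of gradient descent with momentum on a small quadratic. The objective, the function names, and the hyperparameter values are assumptions chosen for illustration; the update uses the velocity form v <- momentum * v - lr * gradient, theta <- theta + v.

```python
def sgd_momentum(grad_fn, theta, lr=0.1, momentum=0.9, steps=300):
    """Gradient descent with (heavy-ball) momentum on a list of parameters."""
    velocity = [0.0] * len(theta)
    for _ in range(steps):
        grad = grad_fn(theta)
        # Accumulate an exponentially decaying sum of past gradients.
        velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        theta = [t + v for t, v in zip(theta, velocity)]
    return theta


def grad_f(theta):
    # Gradient of f(theta) = 0.5 * (theta[0]**2 + 10 * theta[1]**2),
    # an ill-conditioned quadratic with its minimum at the origin.
    return [theta[0], 10.0 * theta[1]]


theta_final = sgd_momentum(grad_f, [5.0, -3.0])
```

Here full gradients are used, so there is no stochasticity; in practice `grad_fn` would return a mini-batch estimate.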

Slides:

Optimization I: First-Order Methods (and normalization methods)
Optimization II: Second-Order Methods (optional)

Video of this lecture (09/02/2022)
Video of the normalization methods (21/02/2022)

Reference: (* = you are responsible for this material)

*Chapter 8 of the Deep Learning textbook.
Why Momentum Really Works. Gabriel Goh, Distill 2017.

05 – Sequential Models (02/02/2022, 07/02/2022, 09/02/2022)

In this lecture we introduce Recurrent Neural Networks and related models.

Lecture on RNNs (slides derived from Hugo Larochelle)

Video of lecture 1 (02/02/2022)
Video of lecture 2 (07/02/2022)
Video of lecture 3 (09/02/2022)

Reference: (* = you are responsible for this material)

*Chapter 10 of the Deep Learning textbook (Sections 10.1-10.11; we will cover the material in 10.12 later).
Blog post on Understanding LSTM Networks by Chris Olah.

04 – PyTorch Tutorial (26/01/2022)

This lecture is an introductory tutorial on PyTorch by Ankit Vani. You are encouraged to follow along on Colab.

A Colab notebook for the tutorial can be found at the following link:

https://colab.research.google.com/drive/1Yt5Oyujw-l_F1EI96BiUL55490VPAdJG?usp=sharing

Video of PyTorch tutorial.

We will cover

the torch.Tensor class, and important attributes and operations
automatic differentiation in PyTorch
torch.nn and torch.optim modules
training MLPs and ConvNets on MNIST
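
For a quick preview of the first three topics, here is a minimal sketch. It is not taken from the tutorial notebook: the synthetic regression data and all variable names are assumptions, and it fits a single linear layer rather than an MLP or ConvNet on MNIST so that it runs without downloading a dataset.

```python
import torch
from torch import nn

torch.manual_seed(0)

# torch.Tensor: basic attributes and operations.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = x @ x.T + 1.0  # matrix product followed by a broadcast addition
print(x.shape, x.dtype, y.shape)

# Automatic differentiation: d/dw sum(w^2) = 2 * w.
w = torch.tensor([2.0, -1.0], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()
print(w.grad)

# torch.nn and torch.optim: fit a linear model to synthetic data.
inputs = torch.randn(64, 3)
targets = inputs @ torch.tensor([[1.0], [-2.0], [0.5]])
model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

losses = []
for _ in range(100):
    optimizer.zero_grad()                      # clear old gradients
    batch_loss = loss_fn(model(inputs), targets)
    batch_loss.backward()                      # backpropagate
    optimizer.step()                           # update the parameters
    losses.append(batch_loss.item())
```

The same zero_grad / backward / step loop carries over to the MNIST models; only the model, the loss, and the data loading change.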

03 – ConvNets (19/01/2022, 24/01/2022, 31/01/2022, 02/02/2022)

In this lecture we finish up our discussion of training neural networks and we introduce Convolutional Neural Networks.

Lecture 03 CNNs (some slides are modified from Hugo Larochelle’s course notes)

Backprop in CNNs (Slides are from Hiroshi Kuwajima’s Memo on Backpropagation in Convolutional Neural Networks.) -- I won't go over these in class, but they are required reading and you are responsible for it. (could be on assignments and exam)

Video of lecture 1 (19/01/2022)

Video of lecture 2 (24/01/2022)

Video of Lecture 3 (31/01/2022)

Video of Lecture 4 (02/02/2022)

Reference: (* = you are responsible for all of this material)

*Chapter 9 of the Deep Learning textbook, Sections 9.10 and 9.11 are optional.
Andrej Karpathy’s excellent tutorial on CNNs.
Paper on convolution arithmetic by Vincent Dumoulin and Francesco Visin.
WaveNet Blog presenting dilated convolutions animation and samples.
Blog on Deconvolution and Checkerboard Artifacts by Augustus Odena, Vincent Dumoulin and Chris Olah.

02 – Training NNets & ML Problems (12/01/2022, 17/01/2022, 19/01/2022)

In these lectures we continue our introduction to neural networks and discuss how to train them, i.e. the backpropagation algorithm.

Lecture 02 training NNs (slides modified from Hugo Larochelle’s course notes)

Machine learning problems (Delayed: slides from Hugo Larochelle's CIFAR DLSS 2019 lectures)

Video (part 1 and part 2) of lecture 1. (12/01/2022)

Video of lecture 2. (17/01/2022)

Video of lecture 3. (19/01/2022)

Reference: (you are responsible for all of this material)

Chapter 6 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).

01 – Introduction to Neural Networks (10/01/2022)

We discuss the plan for the course and the pedagogical method chosen. In this lecture we will also begin our detailed introduction to Neural Networks.

Lecture 01 artificial neurons (slides from Hugo Larochelle’s course notes)

Video of the lecture

Reference: (you are responsible for all of this material)

Chapter 6 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).

00 – Review / Background Material (10/01/2022, mainly on your own)

The first class is on January 10th, 2022. It reviews some foundational material, covering linear algebra, calculus, and the basics of machine learning.

Lecture 00 slides (slides built on Hugo Larochelle’s slides)

Reference:

Chapters 1-5 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).
