FLIPPED CLASS -- PLEASE VIEW THE VIDEO BEFORE CLASS
A lot of the recent progress on many AI tasks was enabled in part by the availability of large quantities of labeled data. Yet, humans are able to learn concepts from as little as a handful of examples. Meta-learning is a very promising framework for addressing the problem of generalizing from small amounts of data, known as few-shot learning. In meta-learning, our model is itself a learning algorithm: it takes as input a training set and outputs a classifier. For few-shot learning, it is (meta-)trained directly to produce classifiers with good generalization performance for problems with very little labeled data. In this talk, Hugo will present an overview of the recent research (including his own) that has made exciting progress on this topic and, if time permits, discuss the challenges as well as research opportunities that remain.
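As a concrete (and simplified) illustration of the "model as a learning algorithm" idea, here is a minimal PyTorch sketch of a single few-shot episode in the style of Prototypical Networks, one family of methods from this line of work. The embedding network, the 5-way 1-shot setting, and the random data are assumptions made purely for illustration; this is not code from the talk.

```python
# A minimal sketch of one few-shot "episode", Prototypical-Networks style.
# Everything below (sizes, data) is made up for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_way, k_shot, n_query, dim = 5, 1, 15, 64
embed = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, dim))

# One episode: a tiny "training set" (support) and "test set" (query)
support = torch.randn(n_way * k_shot, 784)            # k_shot examples per class
query = torch.randn(n_way * n_query, 784)
query_labels = torch.arange(n_way).repeat_interleave(n_query)

# The "inner learning algorithm" here is just averaging support embeddings into prototypes
prototypes = embed(support).view(n_way, k_shot, dim).mean(dim=1)   # (n_way, dim)

# Classify queries by (negative) distance to each prototype; meta-training
# backpropagates this loss through the embedding network across many episodes
logits = -torch.cdist(embed(query), prototypes)        # (n_way*n_query, n_way)
loss = F.cross_entropy(logits, query_labels)
loss.backward()
```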
Video: Hugo's Lecture (from last year's course) and the discussion from this lecture
Slides: Meta-Learning slides
In this lecture, Krishna Murthy will discuss some of the more advanced applications of deep learning to vision tasks, including (time-permitting):
Intro to object detection - Task specification, evaluation metrics (mIoU, mAP; a small IoU sketch follows this list)
Two-stage detectors (RCNN family)
One-stage detectors (SSD, YOLO, RetinaNet)
Segmentation - Task specification, evaluation metrics
One-stage: SegNet, FCN, U-Net
DeepLab class of models
Open challenges in object detection and segmentation (open-set categories, interpretability, real-time operability)
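For reference, the snippet below is a small, self-contained sketch (not taken from the lecture) of the intersection-over-union computation that underlies the mIoU and mAP metrics listed above; the example boxes are made up for illustration.

```python
# Intersection-over-union for two axis-aligned boxes given as (x1, y1, x2, y2).
def iou(box_a, box_b):
    # Intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    # Union = area_a + area_b - intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25 / 175 ≈ 0.143
```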
Slides:
Video of lecture.
In this lecture, Christos Tsirigotis will discuss self-supervised learning. We will discuss how to learn representations beyond the supervised pre-training paradigm, how effective pretext tasks can be designed, and how to train with contrastive objectives.
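As a hedged preview of the contrastive objectives mentioned above, here is a minimal sketch of an InfoNCE / SimCLR-style loss: two augmented views of the same inputs are pulled together in embedding space while the other items in the batch serve as negatives. The encoder, the noise-based stand-in for augmentation, and the temperature are illustrative assumptions, not the lecture's setup.

```python
# A minimal contrastive (InfoNCE / SimCLR-style) objective on toy data.
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, dim, temperature = 32, 128, 0.1
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, dim))

view1 = torch.randn(batch, 784)                 # stand-ins for two augmentations
view2 = view1 + 0.1 * torch.randn_like(view1)   # of the same batch of images

z1 = F.normalize(encoder(view1), dim=1)
z2 = F.normalize(encoder(view2), dim=1)

# Similarity of every view-1 embedding to every view-2 embedding; the diagonal
# entries are the positive pairs, everything else acts as a negative.
logits = z1 @ z2.t() / temperature              # (batch, batch)
targets = torch.arange(batch)
loss = F.cross_entropy(logits, targets)
loss.backward()
```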
Slides:
Video of SSL lecture.
Reference:
Doersch, Carl, Abhinav Gupta, and Alexei A. Efros. "Unsupervised visual representation learning by context prediction." CVPR (2015).
Gidaris, Spyros, Praveer Singh, and Nikos Komodakis. "Unsupervised representation learning by predicting image rotations." ICLR (2018).
Wu, Zhirong, et al. "Unsupervised feature learning via non-parametric instance discrimination." CVPR (2018).
He, Kaiming, et al. "Momentum contrast for unsupervised visual representation learning." CVPR (2020).
Chen, Ting, et al. "Big self-supervised models are strong semi-supervised learners." NeurIPS (2020).
Grill, Jean-Bastien, et al. "Bootstrap your own latent: A new approach to self-supervised learning." NeurIPS (2020).
In this lecture, we will discuss Generative Adversarial Networks (GANs). GANs are a recent and very popular generative model paradigm. We will discuss the GAN formalism, some theory and practical considerations.
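As a rough preview of the formalism, the sketch below performs one alternating GAN update on toy 1-D data using the non-saturating generator loss from the original paper; the network sizes, optimizers, and toy data distribution are assumptions for illustration only, not the lecture's setup.

```python
# One alternating GAN update on toy 1-D data (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))   # noise -> sample
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))   # sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

real = torch.randn(64, 1) * 0.5 + 2.0        # toy "real" data: N(2, 0.25)
noise = torch.randn(64, 8)

# Discriminator step: push D(real) toward label 1 and D(fake) toward label 0
fake = G(noise).detach()
d_loss = F.binary_cross_entropy_with_logits(D(real), torch.ones(64, 1)) + \
         F.binary_cross_entropy_with_logits(D(fake), torch.zeros(64, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step (non-saturating loss): make D call the fakes real
g_loss = F.binary_cross_entropy_with_logits(D(G(noise)), torch.ones(64, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```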
Slides:
Video of part I of GANs lecture.
Video of part II of GANs lecture.
Reference: (* = you are responsible for this material)
*Section 20.10.4 of the Deep Learning textbook.
*Generative Adversarial Networks by Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio (NIPS 2014).
*f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization by Sebastian Nowozin, Botond Cseke and Ryota Tomioka (NIPS 2016).
NIPS 2016 Tutorial: Generative Adversarial Networks by Ian Goodfellow, arXiv:1701.00160v1, 2016
Adversarially Learned Inference by Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky and Aaron Courville (ICLR 2017).
Many other references in the slides.
In this lecture, we will discuss a family of latent variable models known as Variational Autoencoders (VAEs). We’ll see how a deep latent Gaussian model can be seen as an autoencoder via amortized variational inference, and how such an autoencoder can be used as a generative model. At the end, we’ll take a look at variants of the VAE and different ways to improve inference.
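As a hedged preview, here is a minimal sketch of the amortized VAE objective: an encoder outputs the parameters of q(z|x), a reparameterized sample is decoded, and the training loss is the negative ELBO (a reconstruction term plus the KL to a standard normal prior). The layer sizes and the Bernoulli likelihood are illustrative assumptions.

```python
# A minimal VAE training objective (negative ELBO) on stand-in data.
import torch
import torch.nn as nn
import torch.nn.functional as F

x_dim, z_dim = 784, 20
encoder = nn.Linear(x_dim, 2 * z_dim)   # outputs mean and log-variance of q(z|x)
decoder = nn.Linear(z_dim, x_dim)       # outputs Bernoulli logits for p(x|z)

x = torch.rand(64, x_dim)               # stand-in for binarized MNIST images
mu, logvar = encoder(x).chunk(2, dim=1)

# Reparameterization trick: z = mu + sigma * eps keeps the sample differentiable in (mu, logvar)
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

recon = F.binary_cross_entropy_with_logits(decoder(z), x, reduction='sum') / x.size(0)
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
neg_elbo = recon + kl
neg_elbo.backward()
```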
Slides:
Video of part I of Variational Autoencoders lecture.
Video of part II of Variational Autoencoders lecture.
Reference: (* = you are responsible for this material)
*Section 20.10.3 of the Deep Learning textbook.
*Chapter 2 of An Introduction to Variational Autoencoders by Kingma and Welling
Inference Suboptimality in Variational Autoencoders by Chris Cremer (ICML 2018)
Importance Weighted Autoencoders by Yuri Burda (ICLR 2016)
Variational Inference, lecture notes by David Blei, Sections 1-6.
Blog post Variational Autoencoder Explained by Goker Erdogan
Blog post Families of Generative Models by Andre Cianflone
In this lecture, we will have a crash course on Normalizing Flows, and see how they can be used as generative models by inverting a transformation that maps the data distribution into a prior distribution.
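As a hedged preview, the sketch below shows the change-of-variables idea with a deliberately trivial flow (an elementwise affine map): log p(x) is the prior log-density of the transformed point plus the log-determinant of the Jacobian, and sampling corresponds to inverting the map. The toy 2-D data and the parameterization are assumed purely for illustration.

```python
# Change-of-variables log-likelihood with a trivial (elementwise affine) flow.
import math
import torch

scale = torch.randn(2, requires_grad=True)   # log-scales, so the map is always invertible
shift = torch.randn(2, requires_grad=True)

x = torch.randn(64, 2)                        # toy 2-D data
z = (x - shift) * torch.exp(-scale)           # forward (data -> prior) transformation

# log p(x) = log N(z; 0, I) + log |det dz/dx|, with det dz/dx = exp(-sum(scale))
log_prior = -0.5 * z.pow(2).sum(dim=1) - 0.5 * z.size(1) * math.log(2 * math.pi)
log_det = -scale.sum()
log_px = log_prior + log_det

(-log_px.mean()).backward()                   # maximize likelihood; sampling inverts the map
```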
Slides: Normalizing Flows
Video of part I of Normalizing Flows lecture.
Video of part II of Normalizing Flows lecture.
Reference: (* = you are responsible for this material)
*Section 20.10.2 of the Deep Learning textbook.
*Chapters 1-2 (for the core idea) and Chapter 6 (for applications) of Normalizing Flows for Probabilistic Modeling and Inference by George Papamakarios and friends.
In this lecture we will take a closer look at a form of neural network known as an Autoencoder. We will also begin our look at generative models with Autoregressive Models.
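As a hedged preview of the autoregressive part, the sketch below spells out the chain-rule factorization log p(x) = Σ_i log p(x_i | x_<i) with a tiny GRU predicting each binary "pixel" from the previous ones; the sizes and random data are illustrative assumptions, not the models from the readings.

```python
# Autoregressive negative log-likelihood: sum of per-pixel conditionals.
import torch
import torch.nn as nn
import torch.nn.functional as F

seq_len, hidden = 784, 128
rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
readout = nn.Linear(hidden, 1)                           # logit for p(x_i = 1 | x_<i)

x = torch.bernoulli(torch.full((16, seq_len, 1), 0.3))   # stand-in for binarized images

# Condition each step on the previous pixels by shifting the input right by one
inp = torch.cat([torch.zeros(16, 1, 1), x[:, :-1]], dim=1)
h, _ = rnn(inp)
logits = readout(h)

# The negative log-likelihood is a sum of per-pixel Bernoulli terms
nll = F.binary_cross_entropy_with_logits(logits, x, reduction='sum') / 16
nll.backward()
```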
Slides:
Video of Autoencoders lecture.
Video of part I of Autoregressive Generative Models lecture.
Video of part II of Autoregressive Generative Models lecture.
Reference: (* = you are responsible for this material)
*Chapters 13-14 of the Deep Learning textbook.
*Sections 20.10.5-20.10.10 of the Deep Learning textbook.
The Neural Autoregressive Distribution Estimator by Hugo Larochelle and Iain Murray (AISTATS 2011)
MADE: Masked Autoencoder for Distribution Estimation by Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle (ICML 2015).
Pixel Recurrent Neural Networks by Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu (ICML 2016)
*Conditional Image Generation with PixelCNN Decoders by Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu (NIPS 2016)
In this lecture, we will have a rather detailed discussion of regularization methods and their interpretation.
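As a small, hedged illustration of two of the methods from Chapter 7, the sketch below adds an explicit L2 (weight-decay) penalty to the loss and applies dropout during training; the model, data, and penalty coefficient are made up for illustration.

```python
# Two regularizers in one training step: explicit L2 penalty and dropout.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(256, 10))
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))

model.train()                                   # dropout is active only in training mode
data_loss = F.cross_entropy(model(x), y)
l2_penalty = sum(p.pow(2).sum() for p in model.parameters() if p.dim() > 1)
loss = data_loss + 1e-4 * l2_penalty
loss.backward()
```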
Slides:
Video of part I of this lecture (08/03/2021)
Video of part II of this lecture (10/03/2021)
Reference: (* = you are responsible for this material)
*Chapter 7 of the Deep Learning textbook.
Understanding deep learning requires rethinking generalization (ICLR 2017) by Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
In this lecture, we will discuss popular and practical first-order optimization methods. We will not cover second-order methods in class, but I do provide slides on some second-order methods and their interpretation.
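As a hedged illustration of one first-order method, the sketch below writes out the classical (heavy-ball) momentum update by hand rather than through torch.optim, so the two-line rule is explicit; the toy quadratic objective and hyperparameters are illustrative assumptions.

```python
# SGD with momentum, written out by hand on a toy quadratic objective.
import torch

w = torch.randn(10, requires_grad=True)
velocity = torch.zeros_like(w)
lr, beta = 0.1, 0.9

for _ in range(100):
    loss = (w ** 2).sum()                          # toy objective with minimum at w = 0
    loss.backward()
    with torch.no_grad():
        velocity = beta * velocity - lr * w.grad   # decaying running average of gradients
        w += velocity                              # step along the velocity
    w.grad.zero_()
```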
Slides:
Optimization I: First-Order Methods (and normalization methods)
Optimization II: Second-Order Methods (optional)
Video of this lecture (24/02/2021)
Reference: (* = you are responsible for this material)
*Chapter 8 of the Deep Learning textbook.
Why Momentum Really Works. Gabriel Goh, Distill 2017.
In this talk, Arian Hosseini will look at self-attention and the transformer model. We will see how they work, dig deeper into them, examine analyses of their performance, and survey their applications, mainly in natural language processing. We will see how some language models based on the transformer architecture have surpassed human performance on certain language understanding tasks, and we will also discuss their shortcomings.
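As a hedged preview, here is a minimal sketch of scaled dot-product self-attention, the core operation of the transformer: every position attends to every other position through a softmax over query-key similarities. The sizes and random inputs are illustrative assumptions.

```python
# Scaled dot-product self-attention for a small batch of token sequences.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

seq_len, d_model = 10, 64
x = torch.randn(2, seq_len, d_model)                      # a batch of 2 token sequences

W_q, W_k, W_v = (nn.Linear(d_model, d_model) for _ in range(3))
q, k, v = W_q(x), W_k(x), W_v(x)

scores = q @ k.transpose(-2, -1) / math.sqrt(d_model)     # (2, seq_len, seq_len)
attn = F.softmax(scores, dim=-1)                          # how much each token attends to each other token
out = attn @ v                                            # weighted sum of value vectors
print(out.shape)                                          # torch.Size([2, 10, 64])
```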
Slides:
Self-Attention and Transformer by A. Hosseini.
Video of this lecture, part I (15/02/2021)
Reference:
In this lecture we introduce Recurrent Neural Networks and related models.
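As a hedged preview, the sketch below unrolls a vanilla RNN cell by hand, h_t = tanh(W_x x_t + W_h h_{t-1} + b), reusing the same weights at every time step; the sizes and random inputs are illustrative assumptions.

```python
# A vanilla RNN cell unrolled over time with shared weights.
import torch
import torch.nn as nn

input_size, hidden_size, seq_len = 8, 16, 5
W_x = nn.Linear(input_size, hidden_size)
W_h = nn.Linear(hidden_size, hidden_size)

x = torch.randn(seq_len, 4, input_size)        # (time, batch, features)
h = torch.zeros(4, hidden_size)
for t in range(seq_len):
    h = torch.tanh(W_x(x[t]) + W_h(h))         # the same weights are reused at every step
print(h.shape)                                  # torch.Size([4, 16])
```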
Lecture 05 RNNs (slides derived from Hugo Larochelle)
Video of this lecture, part I (15/02/2021)
Video of this lecture, part II (17/02/2021)
Reference: (* = you are responsible for this material)
*Chapter 10 of the Deep Learning textbook (Sections 10.1-10.11; we will cover the material in Section 10.12 later).
Blog post on Understanding LSTM Networks by Chris Olah.
This lecture is an introductory tutorial on PyTorch by Krishna Murthy. You are encouraged to follow along on Colab.
A Colab notebook for the tutorial can be found at the following link:
https://colab.research.google.com/drive/108ilPjSdWBEGAqqPGyXWzJKAcGpPY2kn?usp=sharing
Video of PyTorch tutorial (03/02/2021)
Video of colab tutorial (10/02/2021)
We will cover (a short end-to-end sketch follows this list):
the torch.Tensor class, and important attributes and operations
automatic differentiation in PyTorch
torch.nn and torch.optim modules
training MLPs and ConvNets on MNIST
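If you want to experiment before class, here is a short end-to-end sketch (not the course notebook) touching all four topics: tensors, autograd, torch.nn / torch.optim, and one training step of a small MLP. MNIST-sized shapes and random tensors stand in for real data.

```python
# Minimal tour of the topics above, using random stand-in data.
import torch
import torch.nn as nn
import torch.optim as optim

# torch.Tensor: creation, attributes, basic operations
x = torch.randn(64, 784)              # a batch of 64 flattened 28x28 "images"
print(x.shape, x.dtype, x.device)

# Automatic differentiation: gradients flow through ops on tensors with requires_grad=True
w = torch.randn(784, 10, requires_grad=True)
loss = (x @ w).pow(2).mean()
loss.backward()                       # populates w.grad
print(w.grad.shape)

# torch.nn and torch.optim: a small MLP and an SGD optimizer
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch (replace with a real MNIST DataLoader)
targets = torch.randint(0, 10, (64,))
optimizer.zero_grad()
logits = model(x)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
print(loss.item())
```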
In this lecture we finish up our discussion of training neural networks and we introduce Convolutional Neural Networks.
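As a hedged preview, here is a minimal sketch of the kind of convolutional network discussed in this lecture: convolution, nonlinearity, pooling, then a linear classifier; the MNIST-like input shape is an illustrative assumption.

```python
# A tiny ConvNet: conv -> ReLU -> pool -> linear classifier.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 16 learned 3x3 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(16 * 14 * 14, 10),
)
x = torch.randn(8, 1, 28, 28)                      # a batch of 8 grayscale images
print(net(x).shape)                                # torch.Size([8, 10])
```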
Lecture 03 CNNs (some slides are modified from Hugo Larochelle’s course notes)
Backprop in CNNs (Slides are from Hiroshi Kuwajima’s Memo on Backpropagation in Convolutional Neural Networks.) -- I won't go over these in class, but they are required reading and will be the basis of one of the questions in Assignment 1.
Video of this lecture, part I (01/02/2021). Note that it cuts off rather abruptly when I lost power at home!
Video_(a) and Video_(b) of this lecture, part II (08/02/2021) - in two parts due to technical difficulties
Video of this lecture, part III (10/02/2021).
Reference: (* = you are responsible for all of this material)
*Chapter 9 of the Deep Learning textbook; Sections 9.10 and 9.11 are optional.
Andrej Karpathy’s excellent tutorial on CNNs.
Paper on convolution arithmetic by Vincent Dumoulin and Francesco Visin.
WaveNet Blog presenting dilated convolutions animation and samples.
Blog on Deconvolution and Checkerboard Artifacts by Augustus Odena, Vincent Dumoulin and Chris Olah.
In these lecture(s), we continue our introduction to neural networks and discuss how to train them, i.e. the Backpropagation Algorithm.
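As a hedged illustration of the algorithm, the sketch below writes out backpropagation by hand for a one-hidden-layer network with squared error and checks the result against PyTorch's autograd; the shapes and data are illustrative assumptions.

```python
# Backpropagation by hand for a one-hidden-layer network, checked against autograd.
import torch

x = torch.randn(5, 3)
y = torch.randn(5, 2)
W1 = torch.randn(3, 4, requires_grad=True)
W2 = torch.randn(4, 2, requires_grad=True)

# Forward pass
h = torch.tanh(x @ W1)
pred = h @ W2
loss = 0.5 * ((pred - y) ** 2).sum()

# Backward pass by hand: apply the chain rule layer by layer
dpred = pred - y                          # dL/dpred
dW2 = h.t() @ dpred                       # dL/dW2
dh = dpred @ W2.t()                       # dL/dh
dW1 = x.t() @ (dh * (1 - h ** 2))         # dL/dW1, using d tanh(a)/da = 1 - tanh(a)^2

# Compare with autograd
loss.backward()
print(torch.allclose(dW1, W1.grad), torch.allclose(dW2, W2.grad))
```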
Lecture 02 training NNs (slides modified from Hugo Larochelle’s course notes)
Machine learning problems (Delayed: slides from Hugo Larochelle's CIFAR DLSS 2019 lectures)
Video of this lecture, part I (27/01/2021).
Reference: (you are responsible for all of this material)
Chapter 6 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).
We discuss the plan for the course and the pedagogical method chosen. In this lecture we will also begin our detailed introduction to Neural Networks.
Lecture 01 artificial neurons (slides from Hugo Larochelle’s course notes)
Video of the lecture
Reference: (you are responsible for all of this material)
Chapter 6 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).
The first class is January 18th, 2021. We review some foundational material, covering linear algebra, calculus, and the basics of machine learning.
Lecture 00 slides (slides built on Hugo Larochelle’s slides)
Video of the lecture 1
Video of the lecture 2
Reference:
Chapters 1-5 of the Deep Learning textbook (by Ian Goodfellow, Yoshua Bengio and Aaron Courville).