Course Schedule

Introduction

Natural Language Processing, Recap & Overview

Computer Vision, Recap & Overview

Joint visual-semantic embeddings

Image captioning and its evaluation

Attention, self-attention and Transformers

Visual question answering

Embodied AI, vision-and-language navigation

Multimodal learning

Dataset and model biases