Stephen Hawking famously said, ‘Intelligence is the ability to adapt to change.’ While today's AI systems achieve impressive performance on specific tasks, from accurate image recognition to superhuman play in games such as Go and chess, they remain quite "narrow": they cannot easily adapt to a wide range of new tasks and environments without forgetting what they have learned before, something that humans and animals seem to do naturally throughout their lifetimes.

This course focuses on continual learning (CL), a rapidly growing area of machine learning that aims to push modern AI from "narrow" to "broad", i.e., to develop learning models and algorithms capable of never-ending, lifelong learning over a large, potentially infinite set of environments and tasks. We will review the state-of-the-art literature on continual learning in modern ML, along with related work on the stability-plasticity dilemma in neuroscience. We will focus on the catastrophic forgetting problem and recent approaches to overcoming it in deep neural networks, including regularization, replay, and dynamic-architecture methods, and consider different CL settings (e.g., task-incremental, class-incremental, and task-agnostic). Finally, we will review recent advances in out-of-distribution generalization, a closely related area of ML aimed at building robust models that generalize well across multiple data distributions (environments).
Course Structure and Goals
Paper presentations: 40%
Class project (report + poster presentation): 50%
Class participation (asking questions, joining discussions on Slack and in class): 10%
Note: due to time zone differences, it may be difficult for all students to attend every class live; classes will be recorded, and questions about the papers under discussion can be submitted on the course Slack.