Workshop on Theory and Foundation of Continual Learning

ICML 2021

July 23rd (Virtual)



Machine learning systems are commonly applied to isolated tasks (such as image recognition or playing chess) or narrow domains (such as control over similar robotic bodies). It is further assumed that the learning system has simultaneous access to all annotated data points of the tasks at hand. In contrast, Continual Learning (CL), also referred to as Lifelong or Incremental Learning, studies the problem of learning from a stream of data from changing domains, each connected to a different learning task. The objective of CL is to quickly adapt to new situations or tasks by exploiting previously acquired knowledge, while protecting previous learning from being erased.

Significant advances have been made in CL over the past few years, mostly through empirical investigation and benchmarking. However, theoretical understanding still lags behind. For instance, while Catastrophic Forgetting (CF) is a recurring failure mode that most works try to tackle, the literature offers little theoretical account of it. Many real-life applications share common assumptions and settings with CL: what are the convergence guarantees when deploying a given method? If memory capacity is an important constraint for replay methods, how can we select a minimal set of examples such that CF is minimized? While answers to these questions are key ingredients for designing better heuristics, very little theoretical guidance is available in the literature.

The aim of this workshop is to deepen the theoretical understanding of the different components of continual learning and to bridge the gap with empirical results. We are also interested in submissions that draw connections between Continual Learning and other areas, such as Neuroscience and Meta-learning. The specific research questions we hope to tackle include:

  • What is the mechanism causing catastrophic forgetting?

  • For replay methods with limited memory, which samples should be stored, and how should they be selected?

  • Can we provide regret bounds on forgetting and on the convergence of CL methods?

  • What role can Bayesian principles play in continual learning?

  • How does the tradeoff between plasticity and stability of neural networks affect forgetting and accuracy?

  • Compared to multi-task learning, which components are missing that would let continual learning perform as well with less forgetting?

  • How can we leverage out-of-distribution (OOD) generalization theory to better understand continual learning?
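A common baseline for the memory-selection question above is reservoir sampling, which maintains a fixed-capacity replay buffer in which every example observed so far is equally likely to be retained, without knowing the stream length in advance. A minimal sketch (the class name and interface are illustrative, not taken from any particular CL library):

```python
import random


class ReservoirBuffer:
    """Fixed-capacity replay memory filled by reservoir sampling.

    After t examples have streamed past, each of them is stored
    with equal probability capacity / t.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.num_seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.num_seen += 1
        if len(self.buffer) < self.capacity:
            # Buffer not full yet: always store.
            self.buffer.append(example)
        else:
            # Overwrite a random slot with probability capacity / num_seen.
            idx = self.rng.randrange(self.num_seen)
            if idx < self.capacity:
                self.buffer[idx] = example

    def sample(self, batch_size):
        """Draw a replay mini-batch (without replacement)."""
        return self.rng.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Whether such uniform retention is optimal, or whether informed selection (e.g. by gradient diversity or loss) provably reduces forgetting, is exactly the kind of open question the workshop targets.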

Schedule (July 23rd, CET)

14:00 - 14:30 Emtiyaz Khan - "K-priors: A General Principle of Adaptation" (Invited speaker)

14:30 - 14:50 Siddharth Swaroop - "Continual Deep Learning with Bayesian Principles" (Invited student)

14:50 - 15:00 Ziyang Wu - "Incremental Learning via Rate Reduction" (Contributed talk)

15:00 - 15:30 Christoph H. Lampert - "Learning Theory for Continual and Meta-Learning" (Invited speaker)

15:30 - 16:00 Joel Veness - "A compression-based perspective on continual learning" (Invited speaker)

16:00 - 18:00 Lunch / poster session

18:00 - 19:00 Panel discussion

19:00 - 19:30 Samory Kpotufe - "Optimism and Adaptivity in Transfer and Multitask Learning" (Invited speaker)

19:30 - 19:50 Iman Mirzadeh - "Linear Mode Connectivity in Multitask and Continual Learning" (Invited student)

19:50 - 20:00 Nicholas Soures - "TACOS: Task Agnostic Continual Learning in Spiking Neural Networks" (Contributed talk)

20:00 - 20:30 Chelsea Finn - "Learning Transferable Exploration Strategies via Meta Reinforcement Learning" (Invited speaker)

20:30 - 20:40 Sanket Vaibhav Mehta - "An Empirical Investigation of the Role of Pre-training in Lifelong Learning" (Contributed talk)