Visual Recognition Beyond the Comfort Zone: Adapting to Unseen Concepts on the Fly

ICCV 2023 Tutorial

When: October 02, 2023 (8:45 AM - 1:00 PM)

Where: Paris Convention Centre, Paris, France (Room S04)

Visual recognition models often rely on a simple assumption: the training set contains all information needed to perform the target task. This assumption is violated in most practical applications since the number of semantic concepts and their compositions is too vast to be captured in a single training set, no matter its scale. Given these limitations, researchers are actively studying different ways to extend deep models beyond their comfort zone, i.e., the safe boundaries imposed by their training distribution.


There are two main ways to address this problem: the first is to prepare the model for unavailable semantic concepts (e.g., via zero-shot learning or transfer) and the second is to adapt the model on the fly, exploiting the stream of incoming data at deployment (e.g., via continual learning, open-world learning, or test-time training). Both strategies contain multiple rapidly growing research branches that vary in their assumptions, for instance ranging from open-set recognition to novel class discovery, from independent and identically distributed (IID) to non-IID streams, and from single to multiple labels. 


The aim of this tutorial is to provide an introduction to these topics, describing to attendees different ways to learn models that may adapt/transfer to unseen (or partially available) semantic knowledge. The tutorial will help beginners navigate the various definitions and approaches, providing a detailed overview of advances in various topics, ranging from open-world recognition to test-time training and compositional zero-shot learning. At the same time, the tutorial will help practitioners understand the underlying assumptions and limitations of the related algorithms and problem formulations, clarifying their advantages and disadvantages for specific applications.

Speakers:

Aishwarya Agrawal — University of Montreal, Mila, DeepMind, Canada

Tyler L. Hayes — NAVER LABS Europe, France

Zsolt Kira — Georgia Tech, USA

Massimiliano Mancini — University of Trento, Italy

Riccardo Volpi — NAVER LABS Europe, France

Schedule:

Short Speaker Bios:

Aishwarya Agrawal

Aishwarya Agrawal is an Assistant Professor in the Department of Computer Science and Operations Research at the University of Montreal. She is also a Canada CIFAR AI Chair and a core academic member of Mila - Quebec AI Institute. She also spends one day a week at DeepMind as a Research Scientist. Aishwarya completed her Ph.D. in 2019 at Georgia Tech, working with Dhruv Batra and Devi Parikh. Aishwarya's research interests lie at the intersection of computer vision, deep learning, and natural language processing, with a focus on improving out-of-distribution generalization, compositional reasoning, and data-efficient adaptation to unseen tasks.

Tyler L. Hayes

Tyler Hayes is a Research Scientist on the Visual Representation Learning team at NAVER LABS Europe and a Board Member of the ContinualAI Non-Profit Organization. She completed her Ph.D. in Imaging Science at the Rochester Institute of Technology (RIT), advised by Christopher Kanan. Her research focuses on developing methods that move beyond the closed-world train/test paradigm toward lifelong and open-world learning.

Zsolt Kira

Zsolt Kira is an Assistant Professor at the Georgia Institute of Technology and Associate Director of Georgia Tech’s Machine Learning Center. His work lies at the intersection of machine learning and artificial intelligence for sensor processing, perception, and robotics, emphasizing moving beyond current limitations of machine learning to tackle unsupervised/semi-supervised methods, continual/lifelong learning, and adaptation.

Massimiliano Mancini

Massimiliano Mancini is an Assistant Professor in the Multimedia and Human Understanding Group at the University of Trento. Previously, Massimiliano was a postdoctoral researcher in the Explainable Machine Learning group at the University of Tübingen, led by Prof. Zeynep Akata. He completed his Ph.D. in Engineering in Computer Science at the Sapienza University of Rome, advised by Prof. Barbara Caputo and Prof. Elisa Ricci. His research focuses on transfer learning and compositional generalization.

Riccardo Volpi

Riccardo Volpi is a Research Scientist on the Visual Representation Learning team at NAVER LABS Europe, where he carries out research in continual learning, domain adaptation, model robustness, and explainable AI. Before joining NAVER, he was a postdoctoral researcher at the Istituto Italiano di Tecnologia, where he also pursued his Ph.D., advised by Prof. Vittorio Murino. During his studies, he also spent time at Stanford University, advised by Prof. Silvio Savarese.

We hope to see you there!