Lifelong Robot Learning: Generalization, Adaptation, and Deployment with Large Models
Abstract
In recent years, we have witnessed the tremendous success of large models (so-called foundation models) in computer vision and natural language processing (NLP). This success has spurred growing interest in repurposing existing large models, or training new ones, for robotics problems. Unlike the vision and language domains, however, robotics has no readily available internet-scale data for training such models. An alternative is therefore to train large models for robotics in a lifelong fashion, where robots collect data during deployment and refine the model on the fly. Although the concept and formulation of lifelong learning originated in robotics, it has mostly been studied in vision and NLP. Now, with large models at hand, it is the right time to revisit lifelong robot learning in a scalable manner.
Invited Speakers
Google DeepMind
KAIST
Georgia Institute of Technology
Discussion Topics
In this workshop, our objective is to bring participants together to explore and envision the future paradigm of robot learning and deployment, emphasizing the continuous adaptation of robots to novel scenarios in the era of large models. The workshop is intended for an audience with backgrounds in robot learning, foundation models for robots, and lifelong learning in decision-making. We invite speakers and presenters from these subfields, aiming to inspire meaningful discussion of the generalization, adaptation, and deployment of lifelong robot learning methods given the availability of large models.
We would like to gather people who are interested in generalist agents, lifelong learning, and robotics to discuss the following topics:
Is lifelong learning critical for building robots that can be deployed in the real world? Is large-scale multi-task learning enough for robotics, just as large language models exhibit emergent zero-shot abilities?
How should robots adapt to personal usage? How can robots continually learn from their users while maintaining privacy?
Will a large foundation model for robotics consist of a set of foundation models specialized in perception, control, and language, or will it be a single end-to-end large model? In other words, is it more likely to be a compositional model or a single giant transformer-like model?
What properties should an ideal lifelong learning algorithm possess to efficiently adapt robotics models?
Do we need paradigm shifts in the neural architectures of large models to support lifelong learning?
Given the discussion above, what type of benchmark problems should researchers work on? What metrics should we focus on?
How can we leverage large foundation models to facilitate lifelong robot learning? What new opportunities and challenges does the era of large models bring to lifelong robot learning?
Schedule (Tentative)
Location (Talks)
Contributions
Spotlight:
Lifelong Autonomous Improvement of Navigation Foundation Models in the Wild. Kyle Stachowicz, Lydia Ignatova, Sergey Levine.
Pick up the PACE: A Parameter-Free Optimizer for Lifelong Reinforcement Learning. Aneesh Muppidi, Zhiyu Zhang, Heng Yang.
R+X: Retrieval and Execution from Everyday Human Videos. Georgios Papagiannis, Norman Di Palo, Pietro Vitiello, Edward Johns.
Grounding Language Plans in Demonstrations Through Counterfactual Perturbations. Yanwei Wang.
Distillation of Diffusion Models into Fast and Tractable Mixture of Experts. Hongyi Zhou, Denis Blessing, Ge Li, Onur Celik, Xiaogang Jia, Gerhard Neumann, Rudolf Lioutikov.
Trust the PRoC3S: Solving Long-Horizon Robotics Problems with LLMs and Constraint Satisfaction. Aidan Curtis, Nishanth Kumar, Jing Cao, Tomás Lozano-Pérez, Leslie Pack Kaelbling.
Posters:
Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight. Jiaxu Xing, Angel Romero, Leonard Bauersfeld, Davide Scaramuzza.
Affordance-Guided Reinforcement Learning via Visual Prompting. Olivia Y. Lee, Annie Xie, Kuan Fang, Karl Pertsch, Chelsea Finn.
MaIL: Improving Imitation Learning with Mamba. Xiaogang Jia, Qian Wang, Atalay Donat, Bowen Xing, Ge Li, Hongyi Zhou, Onur Celik, Denis Blessing, Rudolf Lioutikov, Gerhard Neumann.
PRIMP: PRobabilistically-Informed Motion Primitives for Efficient Affordance Learning from Demonstration. Sipu Ruan, Weixiao Liu, Xiaoli Wang, Xin Meng, Gregory S Chirikjian.
Rapid Learning without Catastrophic Forgetting in the Morris Water Maze. Raymond L Wang, Jaedong Hwang, Akhilan Boopathy, Ila R Fiete.
RAIL: Robot Affordance Imagination with Large Language Models. Ceng Zhang, Xin Meng, Dongchen Qi, Gregory S Chirikjian.
Trajectory-level Exploration and Step-based Update in Episodic Reinforcement Learning. Hongyi Zhou, Ge Li, Dominik Roth, Serge Thilges, Fabian Otto, Rudolf Lioutikov, Gerhard Neumann.
Organizers
Heng Yang
Harvard University
David DeFazio
SUNY Binghamton
Yifeng Zhu
UT Austin
Mengdi Xu
CMU
Shiqi Zhang
SUNY Binghamton
Bo Liu
UT Austin