As we increasingly rely on AI-driven robotic autonomy stacks to contend with new tasks and unstructured environments, prudence requires that we also acknowledge the limitations of these systems. In practice, robots often fail to meet our expectations when deployed in the real world, where data distributions shift away from the training distribution, conditions evolve, and rare, unpredictable corner cases degrade the reliability of robot systems powered by ML models. While these reliability concerns are well known, we lack a comprehensive roadmap for addressing them at all levels of a learned autonomy stack. Therefore, this workshop aims to bring together a diverse group of researchers and industry practitioners to chart a roadmap for 1) addressing the disruptive impact of distributional shifts and out-of-distribution (OOD) scenarios on robot performance and 2) examining opportunities to enable trustworthy autonomy in unfamiliar, OOD domains by leveraging emerging tools and novel insights.
Since the inaugural offering of this workshop at CoRL 2023, we have witnessed the rapid emergence of foundational tools, particularly multi-modal large language models, that provide a promising pathway toward building systems that are trustworthy beyond an individual robot’s limited training distribution. Hence, we aim to rekindle a timely discussion on OOD reliability and trustworthy autonomy: Are we now in a position to unblock ourselves and build highly reliable and trustworthy robotic systems for the real world? What key challenges remain unsolved? Our diverse set of speakers and panelists reflects the opinions and operational demands of experts across core ML, autonomous vehicles, household and warehouse manipulation, and drone racing, thereby providing a unifying view of the OOD problem across application domains.
Beyond charting this roadmap, the workshop broadly aims to address gaps between academia and practice by igniting discussions on research challenges, and their synergies, at all timescales crucial to improving reliability and deploying robust autonomous systems:
Safeguarding against OOD scenarios: How can we detect, predict, or reason about OOD scenarios that an AI-based robot is encountering to inform safe decision making? For example, can we leverage full-stack sensory information to anticipate errors and enact interventions to mitigate the consequences? What new tools and methods will help increase the trustworthiness of robots operating beyond the training distribution?
Extrapolation beyond nominal conditions: How can we develop, maintain, and utilize contextual understanding of a robot’s task and environment to facilitate generalization? What role should internet-scale models, e.g., LLMs and VLMs, play in extrapolating to OOD scenarios and beyond a robot’s training data?
Continual data lifecycle as we deploy, evaluate, and retrain learning-enabled robots: How can we efficiently collect data throughout deployment and develop learning algorithms that further improve system robustness and performance? What are the appropriate procedures for re-evaluating and re-certifying robots?
Towards task-relevant definitions of domain shift: There are many ways to define what makes data OOD, with specific choices depending on problem formulations and application contexts. What are the most task-relevant definitions for common robotics problems? How can we quantify generalization, and how will these definitions influence algorithmic approaches and experimental evaluations?