Making Sense of Data in Robotics:
Composition, Curation, and Interpretability at Scale
Location: COEX Convention & Exhibition Center, Room TBA
Everyone wants a working robot that reliably completes goals, interacts with users, and remains safe. Across robot learning approaches – whether scaling imitation learning, building predictive world models, or adapting foundation models for embodied reasoning – the choice of training data is emerging as a critical driver of performance. Yet while data underpins almost every successful robot learning system, we rarely stop to ask which specific aspects of a dataset led to its success. Current hypotheses about what constitutes “good” data for robot learning tend to be heuristic and at times contradictory: arguments may favor either diversity or uniformity, regard multi-modality as beneficial or detrimental, or treat errors in demonstrations as harmful or as useful when recoverable. This workshop brings together diverse perspectives from the robot learning community and broader ML fields to advance a deeper, more principled understanding of what makes robot learning data “good.”
Despite increasing investment in large-scale robotics data – through teleoperation or fleet deployment – collection has outpaced our scientific understanding of what constitutes effective data for robot learning. The way forward is unclear: significant progress has emerged from small, carefully crafted datasets, world modeling, parallel simulation, and representations learned from off-domain data sources in vision and language, all of which make different decisions about what data to use or collect. Moreover, as robot systems grow more capable, these choices become more nuanced: for instance, which data characteristics matter when pushing policy performance from 90% to 99%? The data-design space for robot learning is massive, but seldom discussed in depth.
In sum, this workshop seeks to bring together academic researchers and industry practitioners to develop a science of data for robot learning, structured along the following themes:
🧩 Theme 1: Data Composition – What data should we use in robotics?
What properties and modalities of data (e.g., demonstrations, failures, interventions, tactile sensing, language annotations, non-robotics data, human intent, preference, or uncertainty) provide the most value for training general-purpose robot learning models?
How do the desiderata for dataset composition vary with different robot learning objectives (e.g., imitation learning vs. world modeling)?
Can we meaningfully define and measure important properties of robotics datasets, such as coverage, diversity, and quality? (A minimal sketch of one such measurement follows this list.)
What can we learn from dataset design in other ML domains, like vision and language? Can we formalize taxonomies for robotics dataset composition to promote a similar degree of reusability and comparison?
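As a concrete starting point for the measurement question above, the sketch below shows one way “diversity” and “coverage” could be operationalized over trajectory embeddings. This is a minimal illustration under stated assumptions: the random vectors stand in for the output of any pretrained encoder, and both metrics (mean pairwise distance for diversity; mean nearest-neighbor distance from evaluation tasks for coverage) are simple proposals, not established benchmarks.

```python
# A minimal sketch of embedding-based dataset metrics. Assumes each
# demonstration has already been mapped to a fixed-size embedding by some
# pretrained encoder (vision, language, or state-based); random vectors
# stand in for those embeddings here.
import numpy as np

def diversity_score(embeddings: np.ndarray) -> float:
    """Mean pairwise Euclidean distance between N trajectory embeddings (N x D)."""
    n = embeddings.shape[0]
    dists = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
    # Exclude the zero diagonal by averaging over the N*(N-1) off-diagonal pairs.
    return float(dists.sum() / (n * (n - 1)))

def coverage_score(train_emb: np.ndarray, eval_emb: np.ndarray) -> float:
    """Mean distance from each evaluation embedding to its nearest training
    embedding; lower values mean the evaluation tasks are better covered."""
    dists = np.linalg.norm(eval_emb[:, None, :] - train_emb[None, :, :], axis=-1)
    return float(dists.min(axis=1).mean())

rng = np.random.default_rng(0)
train = rng.normal(size=(100, 32))  # stand-in for 100 demonstration embeddings
evals = rng.normal(size=(20, 32))   # stand-in for 20 evaluation-task embeddings
print(f"diversity: {diversity_score(train):.3f}, coverage: {coverage_score(train, evals):.3f}")
```

Whether such geometric proxies actually track downstream policy performance is exactly the kind of open question this theme aims to examine.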
🧹 Theme 2: Data Curation – What data should we keep, drop, or collect next?
How can we evaluate the quality of robotics data? What makes a “good” example?
What are principled ways to select, filter, or weight data for different robot learning tasks? (One simple filtering heuristic is sketched after this list.)
Do robotics datasets contain harmful biases or spurious correlations? If so, how can we mitigate their effect?
Can we define common benchmarks for data curation in robot learning?
How do methods for active data collection (e.g., curriculum learning, data selection, adaptive sampling) scale to physical robots?
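To make the selection and weighting question above concrete, here is a minimal, hypothetical sketch of one widely used heuristic: score each demonstration by a reference model's loss and keep only the lowest-loss fraction. The per-demonstration losses below are simulated; in practice they might come from a policy trained on a trusted seed set, and the keep fraction is an arbitrary knob rather than a principled choice.

```python
# A hypothetical loss-based filtering sketch: keep the demonstrations a
# reference model finds easiest to predict, on the heuristic assumption
# that high-loss examples are more likely to be noisy or low-quality.
import numpy as np

def filter_by_reference_loss(losses: np.ndarray, keep_fraction: float = 0.8) -> np.ndarray:
    """Return indices of the keep_fraction of examples with the lowest loss."""
    k = int(len(losses) * keep_fraction)
    return np.argsort(losses)[:k]

# Simulated per-demonstration losses; in practice, evaluate a reference
# policy (e.g., one trained on a trusted seed set) on each demonstration.
losses = np.random.default_rng(1).exponential(size=500)
kept = filter_by_reference_loss(losses, keep_fraction=0.8)
print(f"kept {len(kept)} of {len(losses)} demonstrations")
```

Note the built-in tension: exactly this heuristic may discard the recoverable failures and interventions that Theme 1 suggests can be valuable.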
💡 Theme 3: Data Interpretability – How can we understand and analyze the role of data?
What tools exist (or are needed) to interpret how individual data points, demonstrations, or data subsets affect the behavior and generalization of robotics models? (An influence-estimation sketch follows this list.)
Can interpretability guide what new data to collect for a deployed robot system?
How can interpretability inform design and tradeoffs in dataset composition, for example, using a few high-quality examples vs. large-scale, weakly-labeled data?
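One family of tools relevant to the first question above is gradient-based influence estimation, e.g., TracIn (Pruthi et al., 2020), which scores a training example's effect on a test example via the dot product of their loss gradients at model checkpoints. The sketch below illustrates the idea on a toy linear regression model where the gradients are closed-form; it is not a reference implementation for robot policies.

```python
# A toy TracIn-style influence estimate: the influence of training point
# (x_tr, y_tr) on test point (x_te, y_te) is approximated by the dot product
# of their loss gradients at a model checkpoint. A positive score suggests a
# gradient step on the training point would reduce the test loss.
import numpy as np

def grad_mse(w: np.ndarray, x: np.ndarray, y: float) -> np.ndarray:
    """Gradient of 0.5 * (w @ x - y)^2 with respect to the weights w."""
    return (w @ x - y) * x

def influence(w, x_tr, y_tr, x_te, y_te) -> float:
    return float(grad_mse(w, x_tr, y_tr) @ grad_mse(w, x_te, y_te))

rng = np.random.default_rng(2)
w = rng.normal(size=8)                # stand-in checkpoint weights
x_tr, y_tr = rng.normal(size=8), 1.0  # one training example
x_te, y_te = rng.normal(size=8), 0.5  # one evaluation example
print(f"estimated influence: {influence(w, x_tr, y_tr, x_te, y_te):+.3f}")
```

Scaling such estimates to high-capacity policies and long-horizon demonstrations, and using them to decide what data to collect next, are open problems squarely within this theme.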
Marco Pavone is an Associate Professor of Aeronautics and Astronautics at Stanford University, currently on partial leave, and the Director of Autonomous Vehicle Research at NVIDIA. His main research interests are in the development of methodologies for the analysis, design, and control of autonomous systems, with an emphasis on self-driving cars, autonomous aerospace vehicles, and future mobility systems. At Stanford, he is also the Director of the Autonomous Systems Laboratory and Co-Director of the Center for Automotive Research at Stanford. He received a Ph.D. in Aeronautics and Astronautics from the Massachusetts Institute of Technology in 2010. He is a recipient of a number of awards, including a Presidential Early Career Award for Scientists and Engineers from President Barack Obama, an Office of Naval Research Young Investigator Award, a National Science Foundation CAREER Award, a NASA Early Career Faculty Award, and an Early Career Spotlight Award from the Robotics: Science and Systems Foundation. He was identified by the American Society for Engineering Education (ASEE) as one of America's 20 most highly promising investigators under the age of 40. He currently serves as an Associate Editor for IEEE Control Systems Magazine.
Mayee Chen is a PhD candidate in Computer Science at Stanford University, advised by Professor Christopher Ré. Her research focuses on advancing the fundamentals of artificial intelligence through data-centric approaches, particularly training data curation, where she has developed techniques for data mixing, curriculum learning, and weak supervision. Her work has been recognized with a best student paper runner-up award at UAI 2022, a best paper award at an AAAI 2022 workshop, and spotlights at ICLR and NeurIPS 2023. Mayee is currently a research intern at the Allen Institute for AI (AI2), driving the data mixing efforts for OLMo 3, their next open-source large language model. She has also interned at Microsoft Research and received her B.S.E. summa cum laude in Operations Research and Financial Engineering from Princeton University.
Joseph Lim is an Associate Professor in the Kim Jaechul School of Artificial Intelligence at the Korea Advanced Institute of Science and Technology (KAIST). Previously, he was an assistant professor at the University of Southern California (USC). Before that, he completed his PhD at the Massachusetts Institute of Technology under the guidance of Professor Antonio Torralba, followed by a half-year postdoc under Professor William Freeman and a year-long postdoc under Professor Fei-Fei Li at Stanford University. He received his bachelor's degree from the University of California, Berkeley, where he worked in the Computer Vision Lab under the guidance of Professor Jitendra Malik. He has also spent time at Microsoft Research, Adobe Creative Technologies Lab, and Google.
Andreea Bobu is an Assistant Professor at MIT in AeroAstro and CSAIL. She leads the Collaborative Learning and Autonomy Research Lab (CLEAR Lab), which develops autonomous agents that learn to do tasks for, with, and around people. Her goal is to ensure that these agents' behavior is consistent with human expectations, whether they interact with expert designers or novice users.
Andreea's work looks at: 1) getting the right data to supervise agents, whether directly from people or via priors; 2) enabling humans and robots to efficiently and interactively arrive at shared task representations; and 3) quantifying and addressing misalignment caused by different human modeling choices. She grounds her work in experiments and user studies with AI systems like assistive robot arms and LLMs, and draws upon methods from deep learning, mathematical human modeling, inverse reinforcement learning, and Bayesian inference.
She obtained her Ph.D. in Electrical Engineering and Computer Science at UC Berkeley with Anca Dragan. Before MIT, she was a Research Scientist at the AI Institute and an intern in the Robotics Lab at NVIDIA. Prior to her graduate degree, she received a B.S. in Computer Science at MIT.
Chelsea Finn is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University, the William George and Ida Mary Hoover Faculty Fellow, and a co-founder of Physical Intelligence (Pi). Her research interests lie in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. To this end, her work has pioneered end-to-end deep learning methods for vision-based robotic manipulation, meta-learning algorithms for few-shot learning, and approaches for scaling robot learning to broad datasets. Her research has been recognized by awards such as the Sloan Fellowship, the IEEE RAS Early Academic Career Award, and the ACM Doctoral Dissertation Award, and has been covered by various media outlets including the New York Times, Wired, and Bloomberg. Prior to joining Stanford, she received her Bachelor's degree in Electrical Engineering and Computer Science at MIT and her PhD in Computer Science at UC Berkeley.
Max Simchowitz is an Assistant Professor in the Machine Learning Department at Carnegie Mellon University. Previously, he was a Simons-Berkeley Research Fellow, following a postdoctoral position in the Robot Locomotion Group at MIT, where he worked with Professor Russ Tedrake. He completed his PhD in Electrical Engineering and Computer Sciences at the University of California, Berkeley, advised by Professors Ben Recht and Michael Jordan. He began his academic career as a mathematics major at Princeton University, where he conducted research with Sanjeev Arora and David Blei.
Max’s research spans both the theoretical and practical aspects of machine learning, with a focus on sequential decision making, reinforcement learning, and the control of dynamical systems. He is particularly interested in how large-scale, generative AI models can transform robot learning, video prediction, and world modeling.
Masha Itkina is a Research Lead and Manager in the Large Behavior Model (LBM) division at the Toyota Research Institute (TRI). At TRI, she co-leads the Trustworthy Learning under Uncertainty (TLU) effort in the context of robotic manipulation. Her research focuses on policy evaluation, failure detection and mitigation, and active learning. Previously, she completed her PhD at the Stanford Intelligent Systems Lab (SISL) on uncertainty-aware perception for self-driving cars. Her work has been published in top-tier robotics and machine learning conferences, including RSS, CoRL, ICRA, IROS, and NeurIPS.
Christopher Agia, Stanford University
Joey Hejna, Stanford University
Rohan Sinha, Stanford University
Huihan Liu, UT Austin
Helen Wang, University of Washington
Yuejiang Liu, Stanford University
Jack Collins, Collaborative Robotics
Mahi Shafiullah, NYU Courant
Jeannette Bohg, Stanford University
Kimin Lee, KAIST
Dorsa Sadigh, Stanford University
For any questions about the workshop, please email corldataws@gmail.com.