Robot Evaluation for the Real World

Workshop at Robotics: Science and Systems (RSS) 2025

June 25, 2025

Los Angeles, California

RTH (Ronald Tutor Hall) 211 for Talks/Debate/Panel

Epstein Plaza for Poster Session

Workshop Overview

Robotics has seen rapid advancements, with emerging methods and systems achieving impressive results on public benchmarks. Yet, despite these strides, real-world deployment continues to expose persistent challenges in generalization, robustness, safety, and reliability. This workshop aims to critically examine how we evaluate robotic systems and whether current benchmarking practices effectively reflect the complexities of real-world operation across domains like factories, construction sites, and homes.

To ensure the reliability and accountability of robotic systems while maintaining scientific momentum, we propose a fresh look at evaluation practices. Are existing benchmarks sufficient, or should they be redesigned to better capture real-world behavior? Can we provide theoretical guarantees of safety and performance, or must we rely on comprehensive empirical assessments? How do we balance thoroughness in evaluation with accessibility and replicability, especially across diverse research environments?

This workshop brings together experts from robotics, machine learning, human-robot interaction, cognitive science, and related fields to discuss these pressing questions. We organize the challenges into three key themes:

Evaluations and Progress: How can evaluations drive meaningful advancements without creating barriers that stifle innovation?
Accessibility and Relevance: Should benchmarks reflect the full complexity of deployment, or be simplified to foster broader participation?
Alignment Across Stakeholders: How do we design evaluations that meet the needs of academia, industry, and policy, without compromising on rigor or utility?

By exploring both the technical and societal implications of evaluation, we aim to develop frameworks that support not only scientific discovery but also safe and impactful real-world deployment. Through collaborative discussion, we hope to lay the groundwork for rigorous, transparent, and flexible evaluation methodologies that guide the next generation of robotics research.

Recording Information

Morning recording

Link: https://cmu.zoom.us/rec/share/kNxw41gEHLHAsYGQ6UQsrYMEYDa4dEkWinIZFmrYEqlrKAHeFydmykOaYm_W8kBA.4LimyR-UhgBFiUKo

Passcode: 4$3$gvG4

Afternoon recording

Link: https://cmu.zoom.us/rec/share/203iXrkj7CCi8NmVTlsy6WzHsklhK8zoOIY-u0Z20hvBA3lTx1_My8L-Z7AWOHJK.aHRcWiLWALKESjgK

Passcode: 4X*qE%A3

Zoom information

https://cmu.zoom.us/j/94467599166?pwd=YEvC4w1KbiWtK7jBWb0XYSIwRuFp9o.1

Meeting ID: 944 6759 9166

Passcode: 517466

Speakers

Andrea Bajcsy

Assistant Professor, Carnegie Mellon University

Andrea Bajcsy is an Assistant Professor in the Robotics Institute at Carnegie Mellon University, where she leads the Interactive and Trustworthy Robotics Lab (Intent Lab). She broadly works at the intersection of robotics, machine learning, control theory, and human-AI interaction. Prior to joining CMU, Andrea received her Ph.D. in Electrical Engineering & Computer Science from the University of California, Berkeley in 2022. She is the recipient of the NSF CAREER Award (2025), Google Research Scholar Award (2024), Rising Stars in EECS Award (2021), Honorable Mention for the T-RO Best Paper Award (2020), and NSF Graduate Research Fellowship (2016). She has also worked at NVIDIA Research for Autonomous Driving.

Anirudha Majumdar

Associate Professor, Princeton University

Anirudha Majumdar is an Associate Professor at Princeton’s Mechanical and Aerospace Engineering department and a Research Scientist at Google DeepMind. His research focuses on control algorithms for high-performance, safety-critical robotics. He earned his Ph.D. at MIT.

Elena Messina

Prospicience LLC

Elena Messina is Principal at Prospicience LLC, where she consults on strategic planning and on the development, assessment, and adoption of advanced robotics and AI technologies. Previously, she founded and led major research programs and projects at the National Institute of Standards and Technology that focused on advancing the capabilities of robots through the definition of performance requirements, metrics, test methods, tools, and testbeds.

Juha Röning

Professor, University of Oulu

Prof. Juha Röning is the head of the Biomimetics and Intelligent Systems Group (BISG) research unit and a professor at the Faculty of Information Technology and Electrical Engineering at the University of Oulu. He has more than 30 years of experience in mobile robotics, holds three patents, and has published more than 400 papers in the areas of computer vision, robotics, intelligent signal analysis, and software security. He currently serves on the Board of Directors of euRobotics aisbl (Vice-President Research) and Adra. He was the academic coordinator for the DIMECC CyberTrust programme and the project coordinator for the H2020 HYFLIERS and CS-AWARE projects. He is currently the project coordinator of the CS-AWARE-NEXT project (Horizon Europe).

Lindsay Sanneman

Assistant Professor, Arizona State University

Lindsay Sanneman is an Assistant Professor at the School of Computing and Augmented Intelligence at ASU. Her research focuses on model evaluation and alignment from human factors and human-robot interaction perspectives. She received her Ph.D. from MIT.

Vincent Vanhoucke

Distinguished Engineer, Waymo

Vincent Vanhoucke is a Distinguished Engineer at Waymo, focusing on AI and machine learning for robotics. He was a founding member of Google Brain and led Google’s robotics research team. His work includes large-scale deep learning systems for speech and vision, such as the 'Inception' architectures. He holds a Ph.D. from Stanford and is an IEEE Fellow.

Yukie Nagai

Project Professor, University of Tokyo

Yukie Nagai is a Project Professor at the International Research Center for Neurointelligence at the University of Tokyo. Her research spans computational neuroscience and cognitive developmental robotics. She received her Ph.D. from Osaka University.

Panelists

Ted Xiao

Google DeepMind

Ted Xiao is a research scientist at Google DeepMind, where he works on making robots smarter. His research focuses on robot learning, internet-scale foundation models, and reinforcement learning.

Henrik Christensen

Professor, UC San Diego

Dr. Henrik I. Christensen is the Qualcomm Chancellor’s Chair of Robot Systems and a Distinguished Professor of Computer Science in the Department of Computer Science and Engineering at UC San Diego. He is the director of the Contextual Robotics Institute, the Cognitive Robotics Laboratory, and the Autonomous Vehicle Laboratory.