Schedule:
08:45am - 09:00am Opening Remarks
09:00am - 09:30am Keynote: Ludwig Schmidt -- A Data-Centric View on Reliable Generalization
09:30am - 10:00am Keynote: Dorsa Sadigh -- A Path Towards Generalist Policies
10:00am - 10:30am Oral Paper Talks, Session #1 (3 x 10 minutes)
10:30am - 11:00am Coffee Break
11:00am - 11:30am Keynote: Andrea Bajcsy -- How Should Robots Handle Out-of-Distribution Conditions?
11:30am - 12:00pm Keynote: Huazhe Xu -- Robot OOD Generalization Does Not Exist
12:00pm - 12:30pm Oral Paper Talks, Session #2 (3 x 10 minutes)
12:30pm - 02:00pm Lunch Break
02:00pm - 02:30pm Keynote: Masha Itkina -- Evaluation and Uncertainty in the Age of Robot Learning
02:30pm - 03:10pm Lightning Talks
03:10pm - 04:00pm Poster Session (note: the poster session is in Epstein Plaza)
04:00pm - 04:30pm Coffee Break
04:30pm - 05:00pm Keynote: Anirudha Majumdar -- Predictive Red Teaming: Breaking Policies Without Breaking Robots
05:00pm - 05:45pm Panel Session
05:45pm - 06:00pm Closing Remarks and Awards Ceremony
Note: all times are in local Pacific Time
Location:
Main Workshop Room: EEB 132
Poster Room: Epstein Plaza
Hughes Aircraft Electrical Engineering Center
3740 McClintock Ave, Los Angeles, CA 90089
Title: A Data-Centric View on Reliable Generalization
Abstract: Researchers have proposed many methods to make neural networks more reliable under distribution shift, yet substantial room for improvement remains. Are better training algorithms or better training data the more promising way forward? In this talk, we study this question in the context of OpenAI’s CLIP model for learning from image-text data.
First, we survey the ImageNet robustness landscape with a large-scale experimental study involving more than 200 different models and test conditions. The CLIP models stand out with unprecedented robustness on multiple challenging distribution shifts. We then investigate the cause of CLIP’s robustness via controlled experiments that disentangle the influence of language supervision and training distribution. While CLIP was the first model to leverage large-scale language supervision, its robustness actually comes from its pre-training dataset.
We conclude with an overview of research on pre-training datasets, including LAION-5B, the largest public image-text dataset, and experiments to further improve pre-training data (DataComp).
Title: A Path Towards Generalist Policies
Title: How Should Robots Handle Out-of-Distribution Conditions?
Abstract: Out-of-distribution (OOD) detection is becoming increasingly important in robotics, as real-world environments inevitably present conditions beyond what a robot has seen during training. While recent work has made progress on identifying when a robot encounters an OOD condition, detection alone is not enough: robots must also know how to respond when these situations arise. In this talk, I will share my group’s recent efforts toward not just detecting, but also mitigating robot failures that OOD conditions can cause.
Title: Robot OOD Generalization Does Not Exist
Title: Evaluation and Uncertainty in the Age of Robot Learning
Abstract: Large-scale robot learning, colloquially known as Large Behavior Models (LBMs), Embodied Foundation Models (EFMs), or Vision-Language-Action (VLA) models, has increasingly become the norm in the robot learning literature since the success of ChatGPT. Nevertheless, many questions remain surrounding the development of these models in the context of real-world, embodied systems. For example, we need rigorous statistical methodologies for evaluating and comparing existing robot learning models. To deploy these models in human environments, they should be equipped with reliable failure detection systems, despite the challenge that failure types and conditions cannot be fully measured during deployment. Lastly, these models should have the capacity to explore and adapt to new environments, preferences, and tasks. In this talk, I will overview our work as part of the Trustworthy Learning under Uncertainty (TLU) effort at TRI along a few of these research directions, focusing on evaluation and failure detection.
Title: Predictive Red Teaming: Breaking Policies Without Breaking Robots