Sergey Levine – New Behaviors From Old Data: How to Enable Robots to Learn New Skills from Diverse, Suboptimal Datasets
Abstract: Large, diverse datasets form the backbone of modern machine learning. However, while most machine learning methods are concerned with reproducing the distribution seen in the data, robotic skill learning typically requires acquiring skills that are more effective than the behaviors in the data. In this talk, I will discuss how algorithms that combine offline reinforcement learning, planning, and representation learning can enable this capability, using diverse offline data to extract robotic skills that go beyond what was demonstrated. I will discuss algorithmic foundations and experimental results in robotic navigation and object manipulation.
Eric Jang – Iterating on General-Purpose Robots at Scale
Abstract: “Learning” is not just about gathering a big, diverse dataset and taking a million gradient steps. “Learning” is also the process of evaluating and iterating on ideas until the machine learning system does what you want. I’m going to present some back-of-the-envelope calculations showing that the speed of iteration in robotics is still orders of magnitude slower than in other areas of machine learning research. I’ll cover some of the work I did at Google in service of efficient evaluation and iteration of end-to-end robotic learning systems, share some lessons learned in carefully scaling up these systems, and discuss how we might speed up the robotics research process by taking inspiration from other areas of ML research. Finally, I’ll share a sneak preview of some of the things I’m working on at Halodi Robotics.
Abhinav Gupta – Watch, Learn and Manipulate: Learning Manipulation in the Wild from Passive Videos
Full Panel Discussion – Sergey, Eric, Abhinav, Davide, Cathy, and Benjamin, moderated by Chelsea Finn & Dorsa Sadigh.
Davide Scaramuzza – Learning Agile, Vision-based Drone Flight: From Simulation to Reality
Abstract: I will summarize our latest research in learning deep sensorimotor policies for agile vision-based quadrotor flight. Learning sensorimotor policies represents a holistic approach that is more resilient to noisy sensory observations and imperfect world models. However, training robust policies requires a large amount of data. I will show that simulation data is enough to train policies that transfer to the real world without fine-tuning. We achieve zero-shot sim-to-real transfer through the appropriate abstraction of sensory observations and control commands. I will show that these learned policies enable autonomous quadrotors to fly faster and more robustly than before, using only onboard cameras and computation. Applications include acrobatics, high-speed navigation in the wild, and autonomous drone racing.
Kristen Grauman – From First-Person Video to Agent Action
Abstract: First-person or “egocentric” perception requires understanding the video that streams to a wearable camera. It offers a special window into the camera wearer’s attention, goals, and interactions, making it an exciting avenue for robot learning from offline human-captured data. I will present our recent progress using passive observations of human activity to inform active robot behaviors, such as learning effective hand poses and object affordances from video to shape dexterous robot manipulation, or discovering compatible objects to shortcut visual semantic planning. We show how reinforcement learning agents that prefer human-like interactions can successfully accelerate their task learning and generalization. Finally, I will give an overview of Ego4D, a massive new egocentric video dataset and benchmark built by a multi-institution collaboration that offers a glimpse of daily-life activity around the world.
Cathy Wu – Cities as Robots: Scalability, Operations, and Robustness
Abstract: Cities are central to today's sustainability challenges, including public health and safety, environmental impacts, and equity and access. At the same time, cities are becoming more like robots, with increasingly pervasive sensing and new forms of actuation. Through the lens of robotics, machine learning, and transportation engineering, there is a once-in-a-generation opportunity to learn effective interventions that move the needle on long-standing societal challenges. However, urban settings are massively multi-agent, safety-critical yet impossible to model perfectly, and highly varied. This talk focuses on our recent work addressing the scalability of learning methods in urban settings, the scalability of robotic operations in safety-critical environments, and the robustness of learning methods to environmental diversity.
Benjamin Sapp – Multi-agent Behavior Modeling for Autonomous Driving: Models, Representations, and Data
Abstract: In this talk, we focus on behavior modeling for autonomous driving: predicting where the multiple agents in a scene will go next. Such modeling is crucial for safe and efficient driving. We will go over our recent work along three key dimensions of this problem: models, representations, and data. First, we present a new, scalable family of transformer-based deep learning models that achieve state-of-the-art results on public benchmarks. Second, we propose a new factored output representation that captures the joint probabilities of pairs of interacting agents, and show that it significantly improves the consistency of agents' predicted futures. Last, we discuss the need for simulated data for planning, and present our behavior simulator, which synthesizes scenarios that are both diverse and realistic.
Spotlight Talks
1. Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Authors: Yuchen Cui, Scott Niekum, Abhinav Gupta, Vikash Kumar, and Aravind Rajeswaran.
2. Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Authors: Cong Lu, Philip J. Ball, Tim G. J. Rudner, Jack Parker-Holder, Michael A. Osborne, and Yee Whye Teh.
3. How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression
Authors: Yecheng Jason Ma, Jason Yan, Dinesh Jayaraman, and Osbert Bastani.