Sergey Levine – New Behaviors From Old Data: How to Enable Robots to Learn New Skills from Diverse, Suboptimal Datasets
Abstract: Large, diverse datasets form the backbone of modern machine learning. However, while most machine learning methods are concerned with reproducing the distribution seen in the data, robotic skill learning typically requires acquiring skills that are more effective than the behaviors in the data. In this talk, I will discuss how algorithms that combine offline reinforcement learning, planning, and representation learning can enable this capability, using diverse offline data to extract robotic skills that go beyond what was demonstrated. I will discuss algorithmic foundations and experimental results in robotic navigation and object manipulation.
Eric Jang – Iterating on General-Purpose Robots at Scale
Abstract: “Learning” is not just about gathering a big, diverse dataset and taking a million gradient steps. “Learning” is also the process of evaluating and iterating on ideas until the machine learning system does what you want. I’m going to present some back-of-the-envelope calculations showing that the speed of iteration in robotics is still orders of magnitude slower than in other areas of machine learning research. I’ll cover some of the work I did at Google in service of efficient evaluation and iteration of end-to-end robotic learning systems, share some lessons learned in carefully scaling up these systems, and discuss how we might speed up the robotics research process by taking inspiration from other areas of ML research. Finally, I’ll share a sneak preview of some of the things I’m working on at Halodi Robotics.
Abhinav Gupta – Watch, Learn and Manipulate: Learning Manipulation in the Wild from Passive Videos
Full Panel Discussion – Sergey, Eric, Abhinav, Davide, Cathy, and Benjamin, moderated by Chelsea Finn & Dorsa Sadigh.
Davide Scaramuzza – Learning Agile, Vision-based Drone Flight: From Simulation to Reality
Abstract: I will summarize our latest research in learning deep sensorimotor policies for agile vision-based quadrotor flight. Learning sensorimotor policies represents a holistic approach that is more resilient to noisy sensory observations and imperfect world models. However, training robust policies requires a large amount of data. I will show that simulation data is enough to train policies that transfer to the real world without fine-tuning. We achieve zero-shot sim-to-real transfer through the appropriate abstraction of sensory observations and control commands. I will show that these learned policies enable autonomous quadrotors to fly faster and more robustly than before, using only onboard cameras and computation. Applications include acrobatics, high-speed navigation in the wild, and autonomous drone racing.
Kristen Grauman – From First-Person Video to Agent Action
Abstract: First-person or “egocentric” perception requires understanding the video that streams to a wearable camera. It offers a special window into the camera wearer’s attention, goals, and interactions, making it an exciting avenue for robot learning from offline human-captured data. I will present our recent progress using passive observations of human activity to inform active robot behaviors, such as learning effective hand poses and object affordances from video to shape dexterous robot manipulation, or discovering compatible objects to shortcut visual semantic planning. We show how reinforcement learning agents that prefer human-like interactions can successfully accelerate their task learning and generalization. Finally, I will give an overview of Ego4D, a massive new egocentric video dataset and benchmark built by a multi-institution collaboration that offers a glimpse of daily-life activity around the world.
Cathy Wu – Cities as Robots: Scalability, Operations, and Robustness
Abstract: Cities are central to today's sustainability challenges, including public health and safety, environmental impacts, and equity and access. At the same time, cities are becoming more like robots, with increasingly pervasive sensing and new forms of actuation. Through the lens of robotics, machine learning, and transportation engineering, there is a once-in-a-generation opportunity to learn effective interventions that move the needle on long-standing societal challenges. However, urban settings are massively multi-agent, safety-critical yet impossible to model perfectly, and highly varied. This talk focuses on our recent work addressing the scalability of learning methods in urban settings, the scalability of robotic operations in safety-critical environments, and the robustness of learning methods to environmental diversity.
Benjamin Sapp – Multi-agent Behavior Modeling for Autonomous Driving: Models, Representations, and Data
Abstract: In this talk, we focus on behavior modeling for autonomous driving: predicting where the multiple agents in a scene will go next. Such modeling is crucial for safe and efficient driving. We will go over our recent work along three key dimensions of this problem: models, representations, and data. First, we present a new, scalable family of transformer-based deep learning models that achieve state-of-the-art results on public benchmarks. Second, we propose a new factored output representation that captures the joint probabilities of pairs of interacting agents, and show that it significantly improves the consistency of agents' predicted futures. Last, we discuss the need for simulated data for planning, and present our behavior simulator, which synthesizes scenarios that are both diverse and realistic.
Spotlight Talks
1. Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Authors: Yuchen Cui, Scott Niekum, Abhinav Gupta, Vikash Kumar, and Aravind Rajeswaran.
2. Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Authors: Cong Lu, Philip J. Ball, Tim G. J. Rudner, Jack Parker-Holder, Michael A. Osborne, and Yee Whye Teh.
3. How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression
Authors: Yecheng Jason Ma, Jason Yan, Dinesh Jayaraman, and Osbert Bastani.