RSS 2024 Workshop: Generative Modeling meets HRI (GenAI-HRI)
Location: Room ME B - Newton (https://map.tudelftcampus.nl/nl/poi/mechanical-engineering-me/)
July 15, 8:45am-5pm
News:
06/26: Sent out paper notifications!
04/15: Call for papers is out! Submission deadline: June 14th, AoE (extended from May 28th)
In recent years, the rapid evolution of generative modeling, encompassing Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and Diffusion Models, among others, has reshaped the landscape of image synthesis, enabling the generation of highly realistic and diverse visual content. Inspired by this progress, our workshop explores the potential of generative modeling techniques to enhance human-robot interaction (HRI). We aim to bring together the robot learning and human-robot interaction communities to discuss cutting-edge generative modeling techniques, models of human behavior and robot motion, and opportunities to use them for more intuitive human-robot interaction in robot teaching, data collection, language-based interaction, and collaborative execution.
Why are generative models important for research in HRI? HRI stands to benefit greatly from powerful large models that bring open-world knowledge and generalization to classic HRI workflows. Just as ChatGPT has become popular among non-technical users, it is only a matter of time before large models with vision and language capabilities play a key role in generating and mediating interaction between humans and robots, both in daily-life settings (home robots learning your household tasks from examples) and in industrial deployments (co-bots in manufacturing). Generative models are also key to creating simulation environments (3D assets, scenes, tasks, language commands, and language-based task generation), and simulation environments in turn support the collection of human demonstrations, data generation, and policy training. It is important for HRI researchers to foster collaborations that investigate how multi-agent interactions and human-like behaviors will play a role in these systems, whether in simulation or in real-world settings.
Why is HRI important to research in generative models? Conversely, HRI is pivotal for advancing research in generative models. Human interaction and feedback are essential for producing high-quality data for learning and for value-aligned training. For example, reinforcement learning from human feedback (RLHF) has delivered significant advances in model performance, enabling ChatGPT to surpass models trained on static language datasets. Generative models applied to robotics are fundamentally tied to human interaction. In data collection pipelines, we need to provide users with tools, methods, and interfaces for providing and curating high-quality data that learning algorithms can use. For model improvement, we need human feedback in the loop with iterative policy fine-tuning during deployment. These are core interaction problems studied in HRI, now primed to be used in the loop with generative AI in both training and inference, bringing knowledge from interaction and human-centered modeling into robot learning.
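To make the feedback-in-the-loop idea concrete: the core mechanism behind RLHF is learning a reward model from pairwise human preferences, which then drives policy fine-tuning. The sketch below is purely illustrative (not a method from any workshop paper or talk); it fits a toy linear Bradley-Terry reward model to synthetic pairwise preference labels, with the feature dimensionality and the simulated "human" labeler made up for the example.

```python
# Minimal, illustrative sketch of preference-based reward learning
# (the Bradley-Terry objective underlying RLHF). Synthetic data only;
# all names and dimensions here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def reward(theta, traj_features):
    """Linear reward model: r(tau) = theta . phi(tau)."""
    return traj_features @ theta

def preference_prob(theta, feat_a, feat_b):
    """Bradley-Terry probability that the human prefers trajectory A over B."""
    return 1.0 / (1.0 + np.exp(reward(theta, feat_b) - reward(theta, feat_a)))

# Stand-in for human feedback: trajectories are 5-D feature vectors, and the
# (hidden) human preference follows a ground-truth reward theta_true.
theta_true = rng.normal(size=5)
theta = np.zeros(5)
lr = 0.5

for step in range(200):
    feat_a, feat_b = rng.normal(size=(2, 5))
    # Query the simulated "human": 1 if trajectory A is preferred, else 0.
    label = float(reward(theta_true, feat_a) > reward(theta_true, feat_b))
    # Gradient ascent on the log-likelihood of the observed preference.
    p = preference_prob(theta, feat_a, feat_b)
    theta += lr * (label - p) * (feat_a - feat_b)

cos = np.dot(theta, theta_true) / (np.linalg.norm(theta) * np.linalg.norm(theta_true))
print(f"cosine similarity to true reward: {cos:.2f}")
```

In a real HRI pipeline the pairwise labels would come from the interfaces and teaching tools discussed above, and the learned reward would feed a policy-optimization loop rather than end at a reward model.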
Topics of Interest
Motion and Behavior Modeling and Generation
Generative modeling of human-like behaviors in imitation learning
Generative modeling of valid robot motion plans and task and motion planning (TAMP)
Generative modeling of human-robot interactions
Imitation learning and learning from demonstrations (motion and tasks)
Imitation of multi-agent collaborative tasks
Diffusion Models for motion and behavior generation
Generation of scenes, tasks, and interactive behaviors in simulation
Human Interaction for Goal Specification
Interfaces for robot teaching
Teleoperation and shared autonomy
User goal specification for interactively commanding a robot
Large Language Models (LLMs) and Vision Language Models (VLMs) in HRI
LLMs and VLMs for embodied AI
Generative models (LLMs/VLMs) for offline evaluation
Generative models of speech for HRI (dialogue, empathy, engagement)
LLMs as planners for behavior generation
AI-HRI Safety and Alignment
Risks and biases of using generative models for data generation, interaction
Safely deploying generative models for HRI
Out-of-distribution settings in HRI
Invited Speakers & Panelists:
Yilun Du (MIT)
Andrea Bajcsy (CMU)
Ben Burchfiel (TRI)
Siyuan Feng (TRI)
Ted Xiao (Google DeepMind)
Tesca Fitzgerald (Yale)
Sammy Christen (ETH)
Danica Kragic (KTH)
Jens Kober (TU Delft)
Maya Cakmak (UW)
Schedule
8:45 Workshop Intro
Morning Session. Chair: Georgia Chalvatzaki
9:00 Andrea Bajcsy - Towards Human-AI Safety: Unifying Generative AI and Control Systems Safety
9:30 Danica Kragic - Integrating LLMs and Diffusion Policies for Skill Learning
10:00 Coffee Break
10:30 Ben Burchfiel & Siyuan Feng - Large Behavior Models: Challenges and Opportunities
11:00 Ted Xiao - What’s Missing for Robot Foundation Models?
11:30 Felix Wang - Conditional Motion Generation through Online Physical Interaction
12:00 Panel Discussion 1, moderated by Claudia D'Arpino:
Andrea Bajcsy, Ben Burchfiel, Ted Xiao, Jens Kober
12:30 Lunch
Afternoon Session. Chair: Nadia Figueroa
2:00 Tesca Fitzgerald - Toward Comprehensive Models for Interpreting Human Feedback
2:30 Sammy Christen - Modeling and Enhancing Human-Robot Interactions: From Hand-Object Motion Generation to Vision-Based Human-to-Robot Handovers
3:00 1-min Lightning Talks, then transition to Poster Session
3:30 Coffee Break/Poster Session
4:00 Yilun Du - Constructing Customizable Generative Models through Compositional Generation
4:30 Panel Discussion 2, moderated by Sidd Karamcheti:
Tesca Fitzgerald, Yilun Du, Maya Cakmak, Nadia Figueroa
5:00 Best Paper Award Announcement
Authors are invited to submit short papers (3-4 pages excluding references) on topics in generative modeling applied to human-robot interaction. We invite contributions describing ongoing research, results that build on previously presented work, systems, datasets and benchmarks, and papers with demos (that can easily be displayed next to a poster).
Submission
Submissions should use the official RSS LaTeX template. Reviews will be single-blind. Accepted papers will be presented as posters during the workshop, and selected works will have an opportunity for a spotlight talk. Accepted papers will be available online on the workshop website (non-archival). A best paper award will be sponsored by NVIDIA.
Submission link: https://openreview.net/group?id=roboticsfoundation.org/RSS/2024/Workshop/GenAI-HRI
Important Dates
Submission deadline: June 14th, 2024 (Anywhere on Earth [AoE])
Notifications: June 26th, 2024 (AoE)
Camera-ready deadline: July 8th, 2024 (AoE). Submit the revision in OpenReview. NEW: 1-min video submissions.
Workshop date: July 15th, 2024 (Full day)
Spotlight talks: Accepted papers will be presented as 1-min spotlight talks (in person or as pre-recorded videos) in the 3pm session.
Poster session: Accepted papers will also be presented as posters during the poster session.
Accepted Papers:
[PDFs and 1-min videos coming soon!]
3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
Learning Visuotactile Skills with Two Multifingered Hands
Few-Shot Task Learning Through Inverse Generative Modeling
OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning
IRASim: Learning Interactive Real-Robot Action Simulators
Learning Human Preferences from Open-Ended Dialog (Best Paper Runner-Up)
Temporally Entangled Diffusion Models for Fast Robotic Control
Grounding Embodied Question-Answering with State Summaries from Existing Robot Modules
Distillation of Diffusion Models into Fast and Tractable Mixture of Experts
Dynamics-Aware Trajectory Generation for Artistic Painting using Diffusion
Enhancing Surgical Autonomy: Multi-Modal Large Language Models for Human-Like Behavior Generation in Robot-Assisted Blood Suction
Grounding Language Plans in Demonstrations through Counterfactual Perturbations
PRIMP: PRobabilistically-Informed Motion Primitives for Efficient Affordance Learning from Demonstration (Best Paper)
Boosting Robot Behavior Generation with Large Language Models and Genetic Programming
Don't Yell at Your Robot: Physical Correction as the Collaborative Interface for Language Model Powered Robots
MaIL: Improving Imitation Learning with Mamba
Organizers:
MIT · Stanford · TU Darmstadt · TU Darmstadt · Google DeepMind · UPenn · University of Sydney / NVIDIA · MIT · NVIDIA