In recent years, robot learning has made impressive strides toward generally capable manipulation agents. Approaches powered by foundation models and large collections of human demonstrations have shown strong generalization, while others built on pretrained vision-language models exploit their common-sense knowledge to achieve diverse manipulation abilities. However, the dexterity these systems demonstrate usually falls short: most works are limited to simple pick-and-place behaviors, far from the contact-rich skills robots need to accomplish many industrial and household tasks, from insertion and assembly to cooking and dexterous tool use. Meanwhile, recent model-based control and planning work has demonstrated impressive dexterity in contact-rich tasks, yet lacks the generalization and scalability of data-driven approaches. This workshop brings together world-class experts from academia and industry to facilitate discussion around integrating these paradigms, with the goal of enabling robots to robustly perform diverse contact-rich tasks in the open world.
This workshop seeks to chart a course for developing manipulation behaviors beyond pick-and-place, building toward robots that can autonomously solve contact-rich manipulation problems in the open world. It will address the following questions:
How can we leverage model-based manipulation planning, task and motion planning, and optimal control together with learning-based methods for contact-rich manipulation?
How much can we learn about contact-rich interactions from images alone? What additional sensing modalities are required, and how can we scale data collection efforts? How should we use other sources, like human videos?
What are appropriate action spaces for contact-rich manipulation? How do we integrate in-hand and non-prehensile manipulation with prehensile abilities?
How can we enable safe, autonomous learning of contact-rich skills in the real world, through algorithmic and mechanical means?
09:00 - 09:15 Opening Remarks
09:15 - 09:45 Yifan Hou: Empower Robot Learning with Model-based Manipulation
09:45 - 10:15 Hae-Won Park: Contact-implicit Control and Estimation: Legged Robots and More
10:15 - 10:30 Paper Spotlights
10:30 - 11:00 Coffee + Posters
11:00 - 11:30 Ireti Akinola: A Simulation-First Approach to Robotic Assembly: Towards sensor-full contact-rich manipulation
11:30 - 12:00 Maria Bauza: Are we ready to go beyond pick and place?
12:00 - 13:00 Lunch Break
13:00 - 13:30 Katerina Fragkiadaki: 3D Generative Manipulation Policies and Object Dynamics
13:30 - 14:00 David Held: Spatially-aware Robot Manipulation
14:00 - 14:15 Paper Spotlights
14:15 - 14:45 Rachel Holladay
14:45 - 15:15 Michael Posa: Dexterity and generalization? A path toward contact-rich model learning and control
15:15 - 15:30 Paper Spotlights
15:30 - 16:00 Coffee + Posters
16:00 - 16:30 Russ Tedrake: Multitask pretraining for dexterous, contact-rich manipulation
16:30 - 17:15 Debate / Panel: “Charting the Path Toward Contact-rich Manipulation”
featuring Russ Tedrake, Michael Posa, Rachel Holladay, Maria Bauza
17:15 - 17:30 Presentation of “Insights for Robot Learning Beyond Pick-and-Place”
17:30 - 17:45 Closing Remarks, Best Paper Award
See the venue on OpenReview here.
Sunshine Jiang, Xiaolin Fang, Nicholas Roy, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Siddharth Ancha
DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation
Jiangran Lyu, Ziming Li, Xuesong Shi, Chaoyi Xu, Yizhou Wang, He Wang
Tool-as-Interface: Learning Robot Tool Use from Human Play through Imitation Learning
Haonan Chen, Cheng Zhu, Yunzhu Li, Katherine Rose Driggs-Campbell
FoAR: Force-Aware Reactive Policy for Contact-Rich Robotic Manipulation
Zihao He, Hongjie Fang, Jingjing Chen, Hao-Shu Fang, Cewu Lu
GET-Zero: Graph Embodiment Transformer for Zero-shot Embodiment Generalization
Austin Patel, Shuran Song
Metric Semantic Manipulation-Enhanced Mapping via Belief Prediction Models
Nils Dengler, Joao Marcos Correia Marques, Jesper Mücke, Shenlong Wang, Kris Hauser, Maren Bennewitz
AugInsert: Learning Robust Visual-Force Policies via Data Augmentation for Object Assembly Tasks
Ryan Diaz, Adam Imdieke, Vivek Veeriah, Karthik Desingh
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments
Haritheja Etukuru, Norihito Naka, Zijin Hu, Seungjae Lee, Chris Paxton, Soumith Chintala, Lerrel Pinto, Nur Muhammad Mahi Shafiullah
Mobile Pedipulation for Object Sliding via a Wheeled Bipedal Robot
Yue Qin, Yanran Ding
SAIL: Faster-than-Demonstration Execution of Imitation Learning Policies
Nadun Ranawaka Arachchige, Zhenyang Chen, Wonsuhk Jung, Woo Chul Shin, Rohan Bansal, Yu Hang He, Yingyan Celine Lin, Benjamin Joffe, Shreyas Kousik, Danfei Xu
Learning Precise, Contact-Rich Manipulation through Uncalibrated Tactile Skins
Venkatesh Pattabiraman, Yifeng Cao, Siddhant Haldar, Lerrel Pinto, Raunaq Bhirangi
Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation
Han Xue, Jieji Ren, Wendi Chen, Gu Zhang, Fang Yuan, Guoying Gu, Huazhe Xu, Cewu Lu
Ben Abbatematteo (UT Austin)
Roberto Martín-Martín (UT Austin)
Beomjoon Kim (KAIST)
Harshit Khurana (EPFL)
Aude Billard (EPFL)
Jun Yamada (Oxford)
Ingmar Posner (Oxford)
Oliver Kroemer (CMU)
Gentiane Venture (UTokyo / AIST)