HITL-TAMP
Human-In-The-Loop Task and Motion Planning for Imitation Learning

Overview

HITL-TAMP decomposes a robot manipulation task (such as making coffee) into planning-based and learning-based control segments. The planning-based segments are handled by a TAMP planner and the learning-based segments are handled by a human demonstrator (during data collection) and a policy trained on demonstrations (during policy deployment).
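The handoff described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` type, the `run_episode` function, and the coffee segment names are hypothetical, and the two controllers are stand-in callables. It only shows the idea that each segment of a task is owned either by the planner or by the learning-side agent (a human during data collection, a trained policy at deployment).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    name: str  # hypothetical segment label
    mode: str  # "plan" (TAMP-controlled) or "learn" (human or policy)


def run_episode(
    segments: List[Segment],
    planner: Callable[[Segment], str],
    agent: Callable[[Segment], str],
) -> List[Tuple[str, str]]:
    """Run each segment with the controller that owns it and log the result."""
    log = []
    for seg in segments:
        if seg.mode == "plan":
            controller = planner
        elif seg.mode == "learn":
            controller = agent
        else:
            raise ValueError(f"unknown mode: {seg.mode}")
        log.append((seg.name, controller(seg)))
    return log


# Assumed decomposition of a coffee task, for illustration only.
coffee = [
    Segment("move_to_pod", "plan"),
    Segment("grasp_pod", "learn"),
    Segment("move_to_machine", "plan"),
    Segment("insert_pod", "learn"),
]

# Stand-in controllers that just report who acted.
episode_log = run_episode(coffee, lambda seg: "tamp", lambda seg: "human")
```

Swapping the `agent` callable from a human teleoperation interface to a trained policy is the only change between data collection and deployment in this sketch.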

Contributions

We develop HITL-TAMP, an efficient data-collection system for long-horizon and contact-rich manipulation tasks that combines a TAMP system with a human teleoperator and trades off control between them.
HITL-TAMP contains novel components including (1) a mechanism that allows TAMP to learn planning conditions from a small number of demonstrations and (2) a queueing system that allows an operator to manage a fleet of parallel data collection sessions.
We conduct a user study with 15 users to compare HITL-TAMP with a conventional teleoperation system. Given the same time budget, users collectively gathered over 1.4K demos with HITL-TAMP, more than 3x the number gathered with the conventional system. Proficient agents (over 75% success) could be trained from just 10 minutes of non-expert teleoperation data.
We collected an additional 2.1K demos with HITL-TAMP across 12 contact-rich and long-horizon tasks and show that HITL-TAMP often produces near-perfect agents.

HITL-TAMP trains proficient real-world agents on challenging contact-rich and long-horizon manipulation

Below, we show several uncut videos of HITL-TAMP agent performance on real-world manipulation.

Tool Hang

Real-World Tool Hang Agent (64% success and 88% frame insertion success from 50 HITL-TAMP demos)

Stack Three

Real-World Stack Three Agent (62% success from 100 HITL-TAMP demos)

Coffee Broad (both sides)

Real-World Coffee Broad Agent (66% success from 100 HITL-TAMP demos)

Coffee

Real-World Coffee Agent (74% success from 100 HITL-TAMP demos)

Coffee Broad

Real-World Coffee Broad Agent (66% success from 100 HITL-TAMP demos)

HITL-TAMP greatly accelerates data collection and often trains near-perfect agents from this data

Dataset Visualization

HITL-TAMP's queueing system allows human operators to manage fleets of asynchronously running data collection sessions. This lets each operator use their time more effectively and scale the amount of data that they collect. We collected 2.1K+ demos using the system across these tasks.
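To make the queueing idea concrete, here is a toy discrete-time simulation. It is a sketch under stated assumptions, not the system's actual scheduler: the `simulate` function, the tick counts, and the first-in-first-out policy are all hypothetical. Each session runs its planner segment on its own, then waits in a queue until the single operator is free to teleoperate it.

```python
from collections import deque
from typing import List, Tuple


def simulate(tamp_ticks: List[int], human_ticks: int, ticks: int) -> Tuple[int, int]:
    """Return (demos completed, ticks the operator spent idle).

    tamp_ticks[i] is the assumed planning time of session i before handoff;
    human_ticks is the assumed teleoperation time per handoff.
    """
    n = len(tamp_ticks)
    remaining = list(tamp_ticks)  # planning ticks left in each session
    waiting = deque()             # sessions waiting for the operator (FIFO)
    active = None                 # session the operator currently controls
    human_left = 0
    demos = 0
    idle = 0
    for _ in range(ticks):
        # Planner segments advance in parallel in every non-blocked session.
        for s in range(n):
            if s != active and s not in waiting:
                remaining[s] -= 1
                if remaining[s] == 0:
                    waiting.append(s)
        # The operator takes the next waiting session, if any.
        if active is None and waiting:
            active = waiting.popleft()
            human_left = human_ticks
        if active is None:
            idle += 1
        else:
            human_left -= 1
            if human_left == 0:
                demos += 1
                remaining[active] = tamp_ticks[active]  # session resets
                active = None
    return demos, idle


one_session = simulate([2], human_ticks=1, ticks=6)
two_sessions = simulate([2, 2], human_ticks=1, ticks=6)
```

In this toy setting, adding a second session reduces operator idle time because one session's planning overlaps with the other's teleoperation. The numbers are illustrative only and say nothing about the measured speedups reported on this page.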

Square

Square Broad

Three Piece Assembly

Three Piece Assembly Broad

Coffee

Coffee Broad

Coffee Preparation

Tool Hang

Tool Hang Broad

Trained Policies

Each policy below was trained on 200 demos from HITL-TAMP. Several policies are near-perfect.

Square (100%)

Square Broad (100%)

Three Piece Assembly (100%)

Three Piece Assembly Broad (85%)

Coffee (100%)

Coffee Broad (99%)

Coffee Preparation (96%)

Tool Hang (81%)

Tool Hang Broad (49%)

HITL-TAMP enables novice operators to demonstrate tasks efficiently and policies to be learned from just 10 minutes of data, in contrast to conventional teleoperation systems

The policies below were trained on just 10 minutes of operator data from an operator with little to no experience with teleoperation. We compare policies trained on 10 minutes of HITL-TAMP data and 10 minutes of conventional teleoperation data.

Coffee (HITL-TAMP) (100%)

Coffee (Conventional) (28%)

Square Broad (HITL-TAMP) (84%)

Square Broad (Conventional) (0%)

Three Piece Assembly Broad (HITL-TAMP) (22%)

Three Piece Assembly Broad (Conventional) (0%)

Task Reset Distributions

Square

Square Broad

Three Piece Assembly

Three Piece Assembly Broad

Coffee

Coffee Broad

Coffee Preparation

Tool Hang

Tool Hang Broad

Stack Three Real

Tool Hang Real

Coffee Real

Coffee Broad Real

Real-World Data Collection

The videos below are from the HITL-TAMP datasets that we used to train our real-world agents. The video resolution matches the image resolution used for policy training.