Towards Disturbance-Free Visual Mobile Manipulation

Abstract:

Embodied AI has shown promising results on a wide range of robotic tasks in simulation, including visual navigation and manipulation. Prior work generally pursues high success rates along the shortest paths while largely ignoring the problems caused by collisions during interaction. This lack of prioritization is understandable: in simulated environments, there is no inherent cost to breaking virtual objects. As a result, well-trained agents frequently collide catastrophically with objects even when they ultimately succeed. In the robotics community, where the cost of collision is high, collision avoidance is a long-standing and crucial topic for ensuring that robots can be safely deployed in the real world. In this work, we take the first step towards collision/disturbance-free embodied AI agents for visual mobile manipulation, facilitating safe deployment on real robots. We develop a new disturbance-avoidance methodology, at the heart of which is the auxiliary task of disturbance prediction. When combined with a disturbance penalty, our auxiliary task greatly improves sample efficiency and final performance by distilling knowledge of disturbance into the agent. Our experiments on ManipulaTHOR show that, on testing scenes with novel objects, our method improves the success rate from 61.7% to 85.6% and the success rate without disturbance from 29.8% to 50.2% over the original baseline. Extensive ablation studies show the value of our pipelined approach.
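The two headline metrics, success rate and success rate without disturbance, can be illustrated with a minimal sketch. The `Episode` record, the `disturbance` field (total displacement of non-target objects), and the tolerance `tol` are hypothetical names introduced here for illustration; the exact disturbance measure used in the experiments may differ.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    success: bool       # did the target object reach the goal location?
    disturbance: float  # assumed: total displacement of non-target objects


def success_rate(episodes):
    """Fraction of episodes that completed the task."""
    return sum(e.success for e in episodes) / len(episodes)


def success_without_disturbance_rate(episodes, tol=0.0):
    """Fraction of episodes that completed the task AND disturbed
    other objects by no more than `tol` (an assumed threshold)."""
    clean = sum(e.success and e.disturbance <= tol for e in episodes)
    return clean / len(episodes)


# Toy data, not results from the paper.
episodes = [
    Episode(success=True, disturbance=0.0),
    Episode(success=True, disturbance=0.35),
    Episode(success=False, disturbance=0.1),
    Episode(success=True, disturbance=0.0),
]
```

On this toy data, three of four episodes succeed but only two succeed cleanly, which shows why the second metric is always at most the first.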

Success with Disturbance in Visual Mobile Manipulation (ManipulaTHOR)

Our Model Architecture with the Auxiliary Task of Disturbance Prediction
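As a rough illustration of how an auxiliary disturbance-prediction head could be trained alongside the policy, the sketch below treats the task as binary classification (will this step disturb an object or not) and adds its loss to the reinforcement learning loss. The binary cross-entropy form, the function names, and the weight `aux_coef` are assumptions made for this example; the loss actually used by the method may differ.

```python
import math


def disturbance_prediction_loss(logits, labels):
    """Mean binary cross-entropy between predicted disturbance logits
    and 0/1 disturbance labels (assumed formulation)."""
    total = 0.0
    for z, y in zip(logits, labels):
        # Numerically stable BCE-with-logits:
        # max(z, 0) - z * y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def total_loss(rl_loss, aux_loss, aux_coef=0.1):
    """Combine the RL objective with the weighted auxiliary loss.
    `aux_coef` is a hypothetical hyperparameter."""
    return rl_loss + aux_coef * aux_loss
```

A confident correct prediction contributes almost nothing to the loss, while a confident wrong one is penalized roughly in proportion to the logit magnitude, so the shared encoder is pushed to represent whether an action is about to disturb something.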

Demonstrations on Unseen Scenes

The videos below show both the ego-centric and top-down views (in RGB), together with the task description. The agents themselves were trained with ego-centric depth maps and goal coordinate information.

Compared Methods:

  • Baseline: trained without any auxiliary task and without the disturbance penalty

  • Inv Dyn: trained with the Inverse Dynamics auxiliary task and with the disturbance penalty

  • CPC|A: trained with the CPC|A auxiliary task and with the disturbance penalty

  • Ours: trained with the disturbance prediction auxiliary task and with the disturbance penalty
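The four configurations above differ only in the auxiliary task and in whether the disturbance penalty is applied. The sketch below encodes that comparison and shows one simple way a penalty could enter the reward. The dictionary keys, the `disturbance_delta` argument (extra disturbance caused at the current step), and `penalty_coef` are hypothetical; the actual penalty used in training may be defined differently.

```python
# Compared methods, as described in the list above.
METHODS = {
    "Baseline": {"aux_task": None, "disturbance_penalty": False},
    "Inv Dyn": {"aux_task": "inverse_dynamics", "disturbance_penalty": True},
    "CPC|A": {"aux_task": "cpca", "disturbance_penalty": True},
    "Ours": {"aux_task": "disturbance_prediction", "disturbance_penalty": True},
}


def shaped_reward(task_reward, disturbance_delta, method, penalty_coef=1.0):
    """Subtract a penalty proportional to the disturbance caused at this
    step, but only for methods trained with the disturbance penalty.
    The linear form is an assumption made for illustration."""
    if METHODS[method]["disturbance_penalty"]:
        return task_reward - penalty_coef * disturbance_delta
    return task_reward
```

Under this sketch the Baseline receives the unmodified task reward, while the other three methods see the same reward reduced whenever a step disturbs an object.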

Case 1: Without Disturbance

Task: bring the apple from the stove burner to the chair

log_ind_003_pickup_succ_episode_fail_FloorPlan14_obj_Apple_from_StoveBurner_to_Chair.mp4

Baseline [Failure]: Disturbs many objects, such as chairs

log_ind_003_pickup_succ_episode_fail_FloorPlan14_obj_Apple_from_StoveBurner_to_Chair.mp4

Inv Dyn [Failure]: Disturbs many objects, such as chairs

log_ind_003_pickup_succ_episode_fail_FloorPlan14_obj_Apple_from_StoveBurner_to_Chair.mp4

CPC|A [Failure]: Disturbs many objects, such as chairs

log_ind_003_pickup_succ_episode_succ_FloorPlan14_obj_Apple_from_StoveBurner_to_Chair.mp4

Ours [Success]: Without disturbance

Case 2: Without Disturbance

Task: bring the mug from the dining table to the shelf

log_ind_049_pickup_succ_episode_fail_FloorPlan20_obj_Mug_from_DiningTable_to_Shelf.mp4

Baseline [Failure]: Fails to place the object

log_ind_049_pickup_fail_episode_fail_FloorPlan20_obj_Mug_from_DiningTable_to_Shelf.mp4

Inv Dyn [Failure]: Fails to pick up the object

log_ind_049_pickup_succ_episode_fail_FloorPlan20_obj_Mug_from_DiningTable_to_Shelf.mp4

CPC|A [Failure]: Gets stuck on the stool

log_ind_049_pickup_succ_episode_succ_FloorPlan20_obj_Mug_from_DiningTable_to_Shelf.mp4

Ours [Success]: Without disturbance

Case 3: Avoid Bringing Another Object

Task: bring the unseen spatula from the dining table to the stove burner

log_ind_009_pickup_succ_episode_fail_FloorPlan26_obj_Spatula_from_DiningTable_to_StoveBurner.mp4

Baseline [Failure]: Brings the undesired bowl along

log_ind_009_pickup_succ_episode_fail_FloorPlan26_obj_Spatula_from_DiningTable_to_StoveBurner.mp4

Inv Dyn [Failure]: Brings the undesired bowl along

log_ind_009_pickup_succ_episode_fail_FloorPlan26_obj_Spatula_from_DiningTable_to_StoveBurner.mp4

CPC|A [Failure]: Brings the undesired bowl along

log_ind_009_pickup_succ_episode_succ_FloorPlan26_obj_Spatula_from_DiningTable_to_StoveBurner.mp4

Ours [Success]: Without taking the other object