Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real

Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, and Vikash Kumar

Two D'Kitties coordinate to push a heavy block to the target (marked by + signs on the floor)

What's the Scoop?

We have developed Hierarchical Sim2Real, a paradigm for learning complex manipulation via locomotion behaviors on real-world robots.

The Details

Hierarchical Sim2Real is a general paradigm for modular transfer of learned behaviors to the real world.

One first trains a low-level goal-reaching policy in simulation. Appropriate domain randomizations are applied (e.g., random terrain as seen below). The learned policy can successfully transfer to real-world environments in a zero-shot fashion.
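The low-level training loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, terrain parameters, and the stand-in "rollout" are all hypothetical, standing in for a full simulated training procedure.

```python
import math
import random

def sample_terrain(rng):
    """Domain randomization: sample fresh terrain parameters each episode.
    The parameter names and ranges here are illustrative assumptions."""
    return {
        "bump_height": rng.uniform(0.0, 0.05),  # meters
        "friction": rng.uniform(0.5, 1.2),
    }

def goal_reaching_reward(pos, goal):
    """Negative distance to the goal: the low-level policy is rewarded
    simply for reaching a commanded (x, y) position."""
    return -math.hypot(pos[0] - goal[0], pos[1] - goal[1])

def train_low_level(num_episodes=10, seed=0):
    """Sketch of the low-level training loop: each episode uses a newly
    randomized terrain and a random goal, and the policy is optimized
    against the goal-reaching reward (rollout omitted here)."""
    rng = random.Random(seed)
    episode_rewards = []
    for _ in range(num_episodes):
        terrain = sample_terrain(rng)  # new randomization every episode
        goal = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        start = (0.0, 0.0)
        # A real implementation would roll out the policy in simulation
        # on this terrain; we evaluate the reward at the start state as
        # a placeholder.
        episode_rewards.append(goal_reaching_reward(start, goal))
    return episode_rewards

rewards = train_low_level()
```

Because the reward depends only on reaching a commanded position, the same trained policy can later be reused as a locomotion primitive for any higher-level task.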

A high-level goal-proposing policy is then learned on top of it to solve a more complex task. Training again happens in simulation with appropriate domain randomizations. Crucially, due to the hierarchical structure, the domain randomizations are different from, and often simpler than, those used during low-level training (e.g., simple random action noise as shown below). The learned policy is transferred to real-world environments in a zero-shot fashion.
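The two levels compose as follows: the high-level policy periodically proposes a navigation goal, and the frozen low-level policy drives the robot toward it. The sketch below is a hypothetical illustration of this interface (the heuristic goal proposal, the step size, and the noise scale are all assumptions, not the learned policies from the paper).

```python
import math
import random

def high_level_propose(robot_pos, block_pos, target_pos, rng):
    """High-level policy: propose a goal behind the block, on the side
    opposite the target, so that walking to it pushes the block forward.
    Gaussian noise on the proposal mimics the simple action-noise
    randomization used during high-level training (scale is assumed)."""
    dx, dy = block_pos[0] - target_pos[0], block_pos[1] - target_pos[1]
    norm = math.hypot(dx, dy) or 1.0
    return (block_pos[0] + 0.2 * dx / norm + rng.gauss(0.0, 0.02),
            block_pos[1] + 0.2 * dy / norm + rng.gauss(0.0, 0.02))

def low_level_step(robot_pos, goal, step=0.1):
    """Low-level goal-reaching policy, abstracted as a fixed-size step
    toward the commanded goal (a real policy outputs joint commands)."""
    dx, dy = goal[0] - robot_pos[0], goal[1] - robot_pos[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return goal
    return (robot_pos[0] + step * dx / dist, robot_pos[1] + step * dy / dist)

def run_hierarchy(robot_pos, block_pos, target_pos, steps=20, seed=0):
    """Compose the two levels: one high-level proposal, then several
    low-level steps toward it."""
    rng = random.Random(seed)
    goal = high_level_propose(robot_pos, block_pos, target_pos, rng)
    for _ in range(steps):
        robot_pos = low_level_step(robot_pos, goal)
    return robot_pos, goal
```

The key point this interface captures is that the high-level policy never needs to model gait or contact dynamics; it only outputs goals in the space the low-level policy was trained to reach, which is what makes its domain randomization so much simpler.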