Learning Design and Construction with Varying-Sized Materials via Prioritized Memory Resets

Yunfei Li, Tao Kong, Lei Li and Yi Wu

Accepted as a Contributed Paper to ICRA 2022

Abstract

Can a robot autonomously learn to design and construct a bridge from varying-sized blocks without a blueprint? It is a challenging task with long horizon and sparse reward -- the robot has to figure out physically stable design schemes and feasible actions to manipulate and transport blocks. Due to diverse block sizes, the state space and action trajectories are vast to explore. In this paper, we propose a hierarchical approach for this problem. It consists of a reinforcement-learning designer to propose high-level building instructions and a motion-planning-based action generator to manipulate blocks at the low level. For high-level learning, we develop a novel technique, prioritized memory resetting (PMR), to improve exploration. PMR adaptively resets the state to those most critical configurations from a replay buffer so that the robot can resume training on partial architectures instead of from scratch. Furthermore, we augment PMR with auxiliary training objectives and fine-tune the designer with the locomotion generator. Our experiments in simulation and on a real deployed robotic system demonstrate that it is able to effectively construct bridges with blocks of varying sizes at a high success rate.
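The resetting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how "critical" configurations are scored, so the scalar priority, the exponent `alpha`, the probability `p_scratch` of starting from an empty scene, and all class and function names here are hypothetical.

```python
import random
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class ResetMemory:
    """Bounded store of visited states, each with a scalar priority (hypothetical)."""
    capacity: int = 1000
    alpha: float = 1.0  # assumed exponent; 0 would give uniform sampling
    states: List[Any] = field(default_factory=list)
    priorities: List[float] = field(default_factory=list)

    def add(self, state: Any, priority: float) -> None:
        # Drop the oldest entry once the memory is full.
        if len(self.states) >= self.capacity:
            self.states.pop(0)
            self.priorities.pop(0)
        self.states.append(state)
        # Clamp so that every stored state keeps a positive sampling weight.
        self.priorities.append(max(priority, 1e-8))

    def sample(self, rng: random.Random) -> Any:
        # Draw one stored state with probability proportional to priority**alpha.
        weights = [p ** self.alpha for p in self.priorities]
        return rng.choices(self.states, weights=weights, k=1)[0]


def choose_initial_state(memory: ResetMemory, rng: random.Random,
                         p_scratch: float = 0.5) -> Optional[Any]:
    """Return None to start from the empty scene, else a stored partial structure."""
    if not memory.states or rng.random() < p_scratch:
        return None
    return memory.sample(rng)
```

In a training loop, one would call `choose_initial_state` at the start of each episode and restore the simulator to the returned configuration when it is not `None`. How the priority is computed (for example, from a value estimate or an error signal) is left open here because the page does not specify it.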

Learned strategies

A short bridge using 3 blocks

A medium-size bridge using 5 blocks

A long bridge with 7 blocks

Different discovered modes given the same task configuration

Mode #1

Mode #2

Mode #3

Failure case

The agent accidentally knocks down the built structure when placing the purple block. It then clears the scene by moving the blocks out of the valley and tries to construct the bridge again. The episode terminates because it exceeds the time limit.