LEARN 2 ASSEMBLE
Learn2Assemble with Structured Representations and Search for Robotic Architectural Construction
Niklas Funk, Georgia Chalvatzaki, Boris Belousov, Jan Peters
Conference on Robot Learning (CoRL) 2021
Abstract: Autonomous robotic assembly requires a well-orchestrated sequence of high-level actions and smooth manipulation executions. The problem of learning to assemble complex 3D structures remains challenging, as it requires drawing connections between target shapes and available building blocks, as well as creating valid assembly sequences with respect to stability and kinematic feasibility in the robot's workspace. We design a hierarchical control framework that learns to sequence the building blocks to construct arbitrary 3D designs and ensures that they are feasible, as we plan the geometric execution with the robot-in-the-loop. Our approach draws its generalization properties from combining graph-based representations with reinforcement learning (RL) and ultimately adding tree-search. Combining structured representations with model-free RL and Monte-Carlo planning allows agents to operate with various target shapes and building block types. We demonstrate the flexibility of the proposed structured representation and our algorithmic solution in a series of simulated 3D assembly tasks with robotic evaluation, which showcases our method's ability to learn to construct stable structures with a large number of building blocks.
Below, we provide additional videos that supplement the experiments in this work.
Evaluating Graph Architecture
Building Structures using the Multi-Head Attention (MHA) and Structure2Vec (S2V) Graph Architectures
MHA
Using MHA yields more efficient policies.
S2V
Using S2V results in policies that place blocks at unnecessary locations and more often fail to fill the entire structure.
SHA
While single-head attention (SHA) outperforms S2V, using multiple attention heads (MHA) is still superior.
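To make the difference between one and several attention heads concrete, the following is a minimal sketch of scaled dot-product self-attention over graph nodes. It is not the architecture used in this work: the function names are hypothetical, there are no learned projections (queries, keys, and values are the raw node features), and each head simply attends over its own slice of the feature vector.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(feats, neighbors):
    """One head: every node aggregates its neighbors' features,
    weighted by scaled dot-product similarity (identity projections, an assumption)."""
    d = len(feats[0])
    outputs, weights = [], []
    for i, query in enumerate(feats):
        nbrs = neighbors[i]
        scores = [sum(a * b for a, b in zip(query, feats[j])) / math.sqrt(d) for j in nbrs]
        w = softmax(scores)
        outputs.append([sum(wj * feats[j][k] for wj, j in zip(w, nbrs)) for k in range(d)])
        weights.append(w)
    return outputs, weights

def multi_head_attention(feats, neighbors, num_heads):
    """Split the feature dimension into num_heads slices, run one head per slice,
    and concatenate the results. num_heads=1 corresponds to the single-head case."""
    d = len(feats[0])
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    size = d // num_heads
    outputs = [[] for _ in feats]
    for h in range(num_heads):
        chunk = [f[h * size:(h + 1) * size] for f in feats]
        head_out, _ = attention_head(chunk, neighbors)
        for i, row in enumerate(head_out):
            outputs[i].extend(row)
    return outputs

# Hypothetical toy graph: three nodes with four features each, all connected (self included).
feats = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9], [0.5, 0.5, 0.3, 0.3]]
neighbors = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 1, 2]}
single = multi_head_attention(feats, neighbors, num_heads=1)
multi = multi_head_attention(feats, neighbors, num_heads=2)
```

With several heads, each slice of the features gets its own attention weights, so a node can weight its neighbors differently for different parts of the representation.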
Evaluating Graph Connectivity
Building Structures using the partially connected (PC) and fully connected (FC) setups
PC (partial connectivity)
Results in more efficient building, i.e., using fewer blocks to fill the structures.
FC (full connectivity)
Uses more blocks than the PC setup.
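As a rough illustration of the two connectivity setups, the sketch below builds the edge set of a graph in both ways. This page does not state which rule decides the edges in the PC setup, so the distance threshold used here (`radius`) and the function names are purely assumptions for illustration.

```python
import math
from itertools import combinations

def fully_connected(positions):
    """Every pair of nodes shares an edge."""
    return {(i, j) for i, j in combinations(range(len(positions)), 2)}

def partially_connected(positions, radius):
    """Only nodes within `radius` of each other share an edge.
    The distance criterion is an assumption, not the rule used in this work."""
    return {
        (i, j)
        for i, j in combinations(range(len(positions)), 2)
        if math.dist(positions[i], positions[j]) <= radius
    }

# Hypothetical node positions along a line.
positions = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
fc_edges = fully_connected(positions)
pc_edges = partially_connected(positions, radius=1.5)
```

In this toy case the FC graph has three edges while the PC graph keeps only the edge between the two nearby nodes, so message passing in the PC graph is restricted to local neighborhoods.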
Evaluating Learning Algorithms
Building Structures using the different learning algorithms
DQN
Using DQN alone to build the structure, the episode terminates on an invalid action.
DQN + MCTS (budget 10)
Adding MCTS to the DQN agent results in successfully solving the task.
Epsilon-MCTS (budget 10)
The Epsilon-MCTS agent also successfully solves this task.
Q-MCTS (budget 10)
The Q-MCTS agent, like the DQN agent, fails to complete the task and finishes with an invalid action.
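The contrast between a purely greedy value-based agent and one that also queries a model before acting can be sketched on a toy problem. The code below is not the MCTS variant used in this work: it is a much simpler budget-limited one-step lookahead that simulates each candidate action and bootstraps with the Q-values. The environment, `flawed_q`, and all other names are hypothetical.

```python
NUM_SLOTS = 4  # toy task: fill slots 0..3, each slot needs the previous one as support

def step(state, action):
    """Toy model: returns (next_state, reward, done). An invalid action ends the episode."""
    valid = action not in state and (action == 0 or action - 1 in state)
    if not valid:
        return state, -1.0, True
    new_state = state | {action}
    return new_state, 1.0, len(new_state) == NUM_SLOTS

def flawed_q(state, action):
    """Deliberately imperfect value estimate: prefers high slots and ignores support."""
    return -1.0 if action in state else float(action)

def greedy_policy(state):
    """Act on the Q-values alone."""
    return max(range(NUM_SLOTS), key=lambda a: flawed_q(state, a))

def lookahead_policy(state, budget=10):
    """Simulate up to `budget` candidate actions (best Q first) and pick the one
    with the highest simulated reward plus bootstrapped Q-value."""
    ranked = sorted(range(NUM_SLOTS), key=lambda a: flawed_q(state, a), reverse=True)
    best_action, best_value = ranked[0], float("-inf")
    for action in ranked[:budget]:
        next_state, reward, done = step(state, action)
        value = reward
        if not done:
            value += max(flawed_q(next_state, a) for a in range(NUM_SLOTS))
        if value > best_value:
            best_action, best_value = action, value
    return best_action

def run_episode(policy):
    """Roll out a policy; returns (number of filled slots, whether the task was solved)."""
    state, done = frozenset(), False
    while not done:
        state, _, done = step(state, policy(state))
    return len(state), len(state) == NUM_SLOTS

greedy_result = run_episode(greedy_policy)
lookahead_result = run_episode(lookahead_policy)
```

In this toy setting the greedy agent immediately picks an unsupported slot and the episode ends, whereas the lookahead agent detects the invalid action in simulation and completes the task. A full tree search extends the same idea to multi-step simulations within the given budget.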
Evaluating Training with (w) and without (wo) the Robot-in-the-loop
Building Structures using the robot-in-the-loop and investigating whether training with or without the robot matters
Trained w robot
The agent trained with the robot-in-the-loop is capable of building complex shapes without the robot destroying the structure.
Trained wo robot
Training without the robot-in-the-loop is not sufficient: the resulting policies fail to build stable structures because the robotic arm collides with the structure.
Evaluating building complex Shapes with the Robot-in-the-loop
Building complex two- and four-sided structures with the robot, using DQN+MCTS (search budget 10).
two-sided
four-sided
Evaluating Generalization w.r.t. randomized Scenes
Building complex two-sided structures with the robot in novel (unseen) scenarios, using a policy trained in the above setting (DQN+MCTS, search budget 10), to evaluate how well it generalizes.
TWO-SIDED IN UNSEEN ENVIRONMENT
Evaluating Generalization w.r.t. building Blocks
Building complex two-sided structures with and without the robot, using multiple different building blocks (DQN+MCTS, search budget 10).
WITH THE ROBOT
WITHOUT THE ROBOT
Evaluating Transfer to a real Robot
Building a single-sided structure on real hardware using a different manipulator. This is achieved by initializing a simulation scene that mirrors reality (see the top-left corner of the video) and executing the desired actions on the real system.