Soft Robots Learn to Crawl:
Jointly Optimizing Design and Control with Sim-to-Real Transfer

Expert-designed baseline robot (2× speed)
Our optimized design and controller (2× speed)

Abstract

This work provides a complete framework for the simulation, co-optimization, and sim-to-real transfer of the design and control of soft legged robots. The compliance of soft robots provides a form of “mechanical intelligence”—the ability to passively exhibit behaviors that would otherwise be difficult to program. Exploiting this capacity requires careful consideration of the coupling between mechanical design and control. Co-optimization provides a promising means to generate sophisticated soft robots by reasoning over this coupling. However, the complex nature of soft robot dynamics makes it difficult to provide a simulation environment that is both sufficiently accurate to allow for sim-to-real transfer and fast enough for contemporary co-optimization algorithms. In this work, we show that finite element simulation combined with recent model order reduction techniques provides both the efficiency and the accuracy required to successfully learn effective soft robot design-control pairs that transfer to reality. We propose a reinforcement learning-based framework for co-optimization and demonstrate successful optimization, construction, and zero-shot sim-to-real transfer of several soft crawling robots. Our learned robot outperforms an expert-designed crawling robot, showing that our approach can generate novel, high-performing designs even in well-understood domains.

Optimized Design-Controller Pairs

Our framework jointly learns the design and control of crawling soft robots (top) that outperform an expert-designed baseline (bottom). While trained exclusively in simulation, our learned robots are capable of zero-shot sim-to-real transfer, with the optimal design moving more than 2× faster than the baseline in the real world.


Co-optimization via Multitask RL

We formulate the co-optimization of design and control as a multitask RL problem with a set of related Markov decision processes (MDPs) that each contain a different robot design. Our approach learns a single design-conditioned controller on a mixture of data from each MDP and maintains a distribution over design candidates. At each iteration, the method controls a set of designs sampled from the design distribution and updates the policy with soft actor-critic. After an initial training phase, we update the design distribution based on the episode returns of each sampled design.
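The loop above—sample designs from a distribution, control them with a shared policy, and update the distribution from episode returns—can be sketched in miniature. The snippet below is an illustrative toy, not the paper's code: it uses a categorical distribution over a 1-D design grid with a score-function update, and the function `toy_return` is an assumed stand-in for running the SAC-trained, design-conditioned controller on a sampled design.

```python
import numpy as np

rng = np.random.default_rng(0)
designs = np.linspace(0.0, 1.0, 11)   # candidate design parameters (toy 1-D space)
logits = np.zeros_like(designs)       # softmax logits of the design distribution

def toy_return(d, rng):
    # Assumed stand-in for an episode return of the design-conditioned
    # controller on design d; peaks at d = 0.7 with a little noise.
    return -(d - 0.7) ** 2 + 0.01 * rng.standard_normal()

for _ in range(200):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Sample a batch of designs from the current design distribution.
    idx = rng.choice(len(designs), size=4, p=probs)
    returns = np.array([toy_return(designs[i], rng) for i in idx])
    # (In the full method, the same episodes also update the SAC policy.)
    # Score-function update: raise the probability of high-return designs.
    baseline = returns.mean()
    for i, r in zip(idx, returns):
        logits[i] += 0.5 * (r - baseline)

best = designs[np.argmax(logits)]     # distribution concentrates near 0.7
```

The key structural point the sketch preserves is that a single distribution over designs and a single (here trivial) evaluator are improved from the same stream of sampled episodes.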

Design-Reconfigurable Model Order Reduction

An important prerequisite for co-optimization is a simulator that is both fast enough to explore a large set of designs and control strategies and accurate enough to ensure that the learned robots are physically realizable and capable of sim-to-real transfer. To achieve this, we combine finite element simulation with design-reconfigurable model order reduction. We employ the snapshot proper orthogonal decomposition (POD) method to build reduced order models of a set of composable parts. Our approach collects simulation data from a diverse set of designs, gathers the snapshots of each part across designs and transforms them into a common reference frame, and then performs a snapshot-POD reduction and hyperreduction for each part. The reduced order models for each part can then be combined to quickly create reduced order models of soft robots with varying morphologies.
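The core of snapshot POD is a truncated SVD of a matrix of simulation snapshots. The sketch below is a minimal illustration of that step only (the part composition, common-reference-frame alignment, and hyperreduction stages are beyond its scope); the function name and energy tolerance are assumptions, not the paper's API.

```python
import numpy as np

def snapshot_pod_basis(snapshots, energy_tol=0.999):
    """Compute a reduced basis from a matrix of displacement snapshots.

    snapshots: (n_dof, n_snapshots) array; each column is one deformed
               state of a part, expressed in a common reference frame.
    Returns an (n_dof, r) orthonormal basis whose modes capture at
    least `energy_tol` of the snapshot energy.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    # Keep the fewest modes whose cumulative energy reaches the tolerance.
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, energy_tol)) + 1
    return U[:, :r]

# Toy usage: snapshots that lie (almost) in a 2-D subspace of a
# 100-dimensional state space.
rng = np.random.default_rng(0)
modes = rng.standard_normal((100, 2))
coeffs = rng.standard_normal((2, 50))
X = modes @ coeffs + 1e-6 * rng.standard_normal((100, 50))

V = snapshot_pod_basis(X)             # recovers a 2-mode basis
q = V.T @ X                           # reduced coordinates: x ≈ V q
err = np.linalg.norm(X - V @ q) / np.linalg.norm(X)
```

Simulating in the r-dimensional coordinates `q` instead of the full `n_dof` state is what makes the reduced models fast enough for design-space exploration.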

Experiments

We test our approach by searching for a pneumatically actuated soft robot and controller that crawl as fast as possible. We take the five best robots discovered by our algorithm and test their zero-shot sim-to-real transfer performance. Our optimized robots perform significantly better than an expert-designed baseline, with the top robot moving more than 2× faster than the baseline in the real world.

Video