COMPOSER: Scalable and Robust Modular Policies for Snake Robots

Yuyou Zhang, Yaru Niu, Xingyu Liu, Ding Zhao

Carnegie Mellon University

Introduction

The inherent modularity of snake robots can be viewed from three perspectives: high dimensionality, scalability, and redundancy. In this work, we treat the snake robot as a modular robot and formulate its control as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. Each segment of the snake robot operates as an independent module that relies on local observations to determine its actions. Our proposed method, COMPOSER, incorporates a self-attention mechanism to enhance cooperation between agents and employs a high-level imagination policy to enable more efficient learning. COMPOSER demonstrates superior efficiency, robustness against agent corruption, and zero-shot generalizability.
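As a rough illustration of how self-attention can let per-joint agents share information, the sketch below applies single-head scaled dot-product attention across the agents' local observations. It is not the COMPOSER implementation: the function names, the single-head design, and the weight matrices `Wq`, `Wk`, `Wv` are assumptions made for this example. Because the same weights are applied to every agent, the function accepts any number of agents.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(observations, Wq, Wk, Wv):
    """Single-head scaled dot-product attention across agents (hypothetical sketch).

    observations: list of n local observation vectors, one per joint agent.
    Returns n feature vectors, each a weighted mix of all agents' values.
    """
    queries = [matvec(Wq, o) for o in observations]
    keys = [matvec(Wk, o) for o in observations]
    values = [matvec(Wv, o) for o in observations]
    scale = math.sqrt(len(keys[0]))
    attended = []
    for q in queries:
        # Attention weights of this agent over every agent (including itself).
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        attended.append(mixed)
    return attended


# Identity projections keep the example easy to follow (an assumption).
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# The same weights handle different agent counts without modification.
for n_agents in (9, 10, 11):
    obs = [[0.1 * i, 1.0] for i in range(n_agents)]
    features = self_attention(obs, IDENTITY, IDENTITY, IDENTITY)
    print(n_agents, len(features))
```

Each agent's output would then feed its own torque head; that part is omitted here.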

Framework overview of COMPOSER. A snake robot with n joints is formulated as n agents. The modular control policy outputs individual torque commands, while the imagination policy forecasts an ideal displacement per step. The control policy is trained to both complete the task and adhere to the direction prescribed by the imagination policy.
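One way to picture "complete the task and adhere to the imagined direction" is a reward that adds a direction-alignment bonus to the task reward. The sketch below is only an assumption about how such a term could look: the cosine-similarity measure, the weight `beta`, and both function names are hypothetical and are not taken from COMPOSER.

```python
import math


def direction_alignment(actual, imagined, eps=1e-8):
    """Cosine similarity between the actual and imagined displacement vectors.

    Returns 0.0 when either vector is (near) zero, so a stationary step
    is neither rewarded nor penalized.
    """
    dot = sum(a * b for a, b in zip(actual, imagined))
    norm_actual = math.sqrt(sum(a * a for a in actual))
    norm_imagined = math.sqrt(sum(b * b for b in imagined))
    if norm_actual < eps or norm_imagined < eps:
        return 0.0
    return dot / (norm_actual * norm_imagined)


def shaped_reward(task_reward, actual, imagined, beta=0.5):
    """Task reward plus a weighted direction-following bonus (hypothetical)."""
    return task_reward + beta * direction_alignment(actual, imagined)


# Moving exactly along the imagined direction earns the full bonus.
print(shaped_reward(1.0, [1.0, 0.0], [1.0, 0.0]))
# Moving perpendicular to it earns no bonus.
print(shaped_reward(1.0, [1.0, 0.0], [0.0, 1.0]))
```

Using cosine similarity makes the bonus depend only on direction, not on step length; a different alignment measure would work equally well in this sketch.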

Experiments

Goal-reaching with random goals

Block-pushing

Success: push

Success: fling

Failure: multiple attempts with contact adjustment

Shape-formation

Success: target shape with small curvature

Success: target shape with sharp curvature

Failure: target shape with sharp curvature

Tube-crossing

Success

Wall climbing

Success: vertical wall climb (w/ imagination)

Success: horizontal wall traverse (w/o imagination)

Emergent locomotion patterns

Lateral Undulation: the most common serpentine locomotion

Slide-Pushing: used when the snake is startled and tries to escape quickly

Lateral Undulation w/o imagination

Slide-pushing w/ imagination

Zero-shot Generalizability

9-agent

10-agent

11-agent

Robustness

COMPOSER success rate: 65/100

PPO success rate: 22/100