COMPOSER: Scalable and Robust Modular Policies for Snake Robots
Yuyou Zhang, Yaru Niu, Xingyu Liu, Ding Zhao
Carnegie Mellon University
The inherent modularity of snake robots can be viewed from three perspectives: high dimensionality, scalability, and redundancy. In this work, we treat the snake robot as a modular robot and formulate its control as cooperative Multi-Agent Reinforcement Learning (MARL). Each segment of the snake robot operates as an independent module, relying on local observations to determine its actions. Our proposed method, COMPOSER, incorporates a self-attention mechanism to enhance cooperative behavior between agents and employs a high-level imagination policy to enable more efficient learning. COMPOSER demonstrates superior efficiency, robustness against agent corruption, and zero-shot generalizability.
Framework overview of COMPOSER. A snake robot with n joints is formulated as n agents. The modular control policy outputs individual torque commands, while the imagination policy forecasts an ideal displacement at each step. The control policy is trained both to complete the task and to follow the direction prescribed by the imagination policy.
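For concreteness, the following is a minimal sketch, not the authors' released implementation, of how a shared attention-based modular policy and an imagination-guided reward could be wired together in PyTorch. The module names, dimensions, and the reward weight beta are illustrative assumptions.

import torch
import torch.nn as nn

class ModularAttentionPolicy(nn.Module):
    # One shared policy executed by every joint/agent; agents exchange
    # information through self-attention over their encoded local observations.
    def __init__(self, obs_dim, hidden_dim=64, n_heads=4):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.attn = nn.MultiheadAttention(hidden_dim, n_heads, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # one torque command per agent

    def forward(self, local_obs):
        # local_obs: (batch, n_agents, obs_dim); each agent only sees its own
        # local observation, and attention supplies the cooperation.
        h = torch.relu(self.encoder(local_obs))
        h, _ = self.attn(h, h, h)
        return torch.tanh(self.head(h)).squeeze(-1)  # (batch, n_agents) torques

def shaped_reward(task_reward, displacement, imagined_displacement, beta=0.1):
    # Combine task completion with alignment to the displacement forecast by
    # the high-level imagination policy (beta is an assumed weight).
    alignment = torch.nn.functional.cosine_similarity(
        displacement, imagined_displacement, dim=-1)
    return task_reward + beta * alignment

# Because the policy is shared and acts on per-agent observations, the same
# weights apply unchanged to longer snakes at test time.
policy = ModularAttentionPolicy(obs_dim=16)
torques_8 = policy(torch.randn(2, 8, 16))    # trained with 8 agents
torques_11 = policy(torch.randn(2, 11, 16))  # zero-shot on an 11-agent snake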
The snake robot learns manipulation skills including push, fling, and contact adjustment.
Success: push
Success: fling
Failure: multiple attempts with contact adjustment
The shape-formation task involves both locomotion and deformation control.
Success: target shape with small curvature
Success: target shape with sharp curvature
Failure: target shape with sharp curvature
The tube-crossing task involves navigating in a confined space with curved terrain, requiring effective interactions with the environment.
Success
In the wall-climbing task, the vertical wall climb demonstrates the quick and efficient task success enabled by the imagination policy.
Success: vertical wall climb (w/ imagination)
Success: horizontal wall traverse (w/o imagination)
Lateral Undulation: the most common serpentine locomotion
Slide-Pushing: used when the snake is startled and tries to escape quickly
In the goal-reaching task, the modular policy with imagination exhibited the "Slide-Pushing" pattern, whereas without imagination it produced "Lateral Undulation."
Lateral Undulation w/o imagination
The "Slide-pushing" pattern highlights the contribution of the imagination policy in achieving efficient task success.
Slide-pushing w/ imagination
A modular policy trained on an 8-agent snake can generalize to longer snakes in a zero-shot manner.
9-agent
10-agent
11-agent
Robustness is tested on snake robots with one randomly selected actuator disabled (a sketch of this failure-injection protocol follows the results below).
COMPOSER success rate: 65/100
PPO success rate: 22/100
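As a rough illustration of this evaluation protocol, the following is a minimal sketch, assumed rather than taken from the paper's code: one joint is selected at random per episode and its torque command is zeroed for the entire rollout. Function names and the mask representation are hypothetical.

import random

def make_failure_mask(n_joints):
    # Sample the episode-level failure: one randomly chosen joint is disabled
    # for the whole rollout.
    broken = random.randrange(n_joints)
    return [0.0 if j == broken else 1.0 for j in range(n_joints)]

def apply_failure(torques, mask):
    # Zero the disabled actuator's torque command at every control step.
    return [t * m for t, m in zip(torques, mask)]

# Example: an 8-joint snake with one failed actuator
mask = make_failure_mask(8)
corrupted = apply_failure([0.3] * 8, mask)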