Low-Rank Modular Reinforcement Learning via Muscle Synergy
Heng Dong*, Tonghan Wang*, Jiayuan Liu, Chongjie Zhang
Abstract
Modular Reinforcement Learning (RL) decentralizes the control of multi-joint robots by learning a policy for each actuator. Previous work on modular RL has demonstrated that a shared actuator policy can control morphologically different agents. However, as the number of Degrees of Freedom (DoF) of a robot increases, training a morphology-generalizable modular controller becomes exponentially more difficult. Motivated by the way the human central nervous system controls numerous muscles, we propose a Synergy-Oriented LeARning (SOLAR) framework that exploits the redundant nature of DoF in robot control. Actuators are grouped into synergies by an unsupervised learning method, and a synergy action is learned to control multiple actuators in synchrony. In this way, we achieve low-rank control at the synergy level. We extensively evaluate our method on a variety of robot morphologies, and the results show its superior efficiency and generalizability, especially on robots with a large DoF such as Humanoid++ and UNIMALs.
Framework
The intra-synergy attention module aggregates actuator information within each synergy. The inter-synergy attention module synthesizes information from all synergies to produce synergy actions. Synergy actions are then transformed linearly to obtain actuator actions. Because every actuator action is a linear function of a smaller set of synergy actions, the resulting actuator actions are of a lower rank, which reduces the control complexity.
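The final linear step can be illustrated with a minimal sketch. It is not the SOLAR implementation: the attention modules are omitted, and the names `synergy_to_actuator_actions`, `assignment`, and `weights` are hypothetical, introduced only for this example. It assumes each actuator belongs to exactly one synergy and scales that synergy's action by a per-actuator weight.

```python
import numpy as np

def synergy_to_actuator_actions(synergy_actions, assignment, weights):
    """Linearly map K synergy actions to N actuator actions.

    synergy_actions: shape (..., K), one action per synergy.
    assignment: shape (N,), index of the synergy each actuator belongs to.
    weights: shape (N,), per-actuator scale (hypothetical parameter).
    """
    num_actuators = assignment.shape[0]
    num_synergies = synergy_actions.shape[-1]
    # (N, K) mixing matrix: row i is nonzero only in the column of actuator i's synergy.
    mixing = np.zeros((num_actuators, num_synergies))
    mixing[np.arange(num_actuators), assignment] = weights
    return synergy_actions @ mixing.T, mixing

# Toy setup (assumed): 6 actuators grouped into 3 synergies.
assignment = np.array([0, 0, 1, 1, 1, 2])
weights = np.array([1.0, -0.5, 0.8, 0.8, 0.3, 1.0])

# A batch of 100 synergy actions, e.g. one per time step.
rng = np.random.default_rng(0)
synergy_batch = rng.standard_normal((100, 3))
actuator_batch, mixing = synergy_to_actuator_actions(synergy_batch, assignment, weights)

# The 6-dimensional actuator actions span at most 3 dimensions.
rank = np.linalg.matrix_rank(actuator_batch)
```

In this toy setup the policy only has to choose 3 numbers per step instead of 6, and the matrix of actuator actions collected over time has rank at most the number of synergies.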
Performance
Multi-task with different morphologies
Multi-task performance of our method SOLAR compared to baselines and ablations.
Multi-task performance of our method SOLAR compared to baseline and ablations on Humanoid++.
Zero-shot generalization
Zero-shot performance of our method SOLAR compared against baseline and ablations.
Analysis
Synergy clustering results of SOLAR in Humanoid++. Different colors represent different synergy clusters, and joints marked with the same color are in the same cluster. Joints marked with triangles are the centers of their corresponding clusters.
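To make the idea of clusters whose centers are themselves joints concrete, here is a small sketch using k-medoids. This is a stand-in chosen for illustration: the text above only states that an unsupervised learning method is used, and neither the algorithm nor the one-dimensional joint features below are taken from SOLAR. The function `k_medoids` and the toy data are hypothetical.

```python
import numpy as np

def k_medoids(distances, num_clusters, num_iters=20):
    """Cluster points so that each cluster center (medoid) is one of the points.

    distances: shape (n, n), pairwise distances between points.
    Returns (labels, medoids): the cluster index of each point and the
    point index serving as the center of each cluster.
    """
    medoids = np.arange(num_clusters)  # simple deterministic init: first k points
    for _ in range(num_iters):
        # Assign every point to its nearest medoid.
        labels = np.argmin(distances[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(num_clusters):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # The new medoid is the member closest to all other members.
            within = distances[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(distances[:, medoids], axis=1)
    return labels, medoids

# Toy joint features (assumed): two well-separated groups of three joints.
features = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
distances = np.abs(features[:, None] - features[None, :])
labels, medoids = k_medoids(distances, num_clusters=2)
```

Here `labels` plays the role of the colors in the figure and `medoids` the role of the triangle-marked joints.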
Synergy structure evolution of SOLAR in Humanoid. Phases are delimited by changes in the synergy clusters, and each synergy cluster is masked with a colored shape.
Video