SeqWM: Empowering Multi-Robot Cooperation via Sequential World Models

Abstract

To address the difficulty of applying model-based reinforcement learning (MBRL) to multi-robot systems, we propose the Sequential World Model (SeqWM). This framework decomposes complex joint dynamics by using independent, sequentially structured models for each agent. Planning and decision-making occur via sequential communication, where each agent bases its actions on the predictions of its predecessors. This design enables explicit intention sharing, boosts cooperative performance, and reduces communication complexity. Results show that SeqWM outperforms state-of-the-art methods in simulations and real-world deployments, achieving advanced behaviors such as predictive adaptation and role division.
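The sequential planning idea in the abstract can be illustrated with a small toy sketch. It is not the SeqWM implementation: the constant-velocity `rollout` stands in for a learned per-agent world model, and the names `rollout`, `cost`, `sequential_plan`, `min_gap`, and the 1-D setting are all assumptions made for illustration. Each agent plans in turn, receives the predicted trajectories of its predecessors, and picks the action that reaches the goal while keeping clear of them.

```python
from typing import List, Sequence, Tuple

Trajectory = List[float]  # predicted 1-D positions over the planning horizon


def rollout(x0: float, v: float, horizon: int) -> Trajectory:
    """Toy stand-in for one agent's world model: constant-velocity prediction."""
    return [x0 + v * (t + 1) for t in range(horizon)]


def cost(traj: Trajectory, predecessors: Sequence[Trajectory],
         goal: float, min_gap: float) -> float:
    """Distance to goal at the end of the horizon, plus a penalty for every
    step that comes closer than min_gap to a predecessor's prediction."""
    c = abs(goal - traj[-1])
    for other in predecessors:
        for a, b in zip(traj, other):
            if abs(a - b) < min_gap:
                c += 100.0  # hypothetical collision penalty
    return c


def sequential_plan(starts: Sequence[float], candidates: Sequence[float],
                    goal: float, horizon: int = 5,
                    min_gap: float = 0.5) -> Tuple[List[float], List[Trajectory]]:
    """Agents plan one after another; each sees only its predecessors' messages."""
    messages: List[Trajectory] = []  # predictions passed down the sequence
    actions: List[float] = []
    for x0 in starts:
        best_v = min(
            candidates,
            key=lambda v: cost(rollout(x0, v, horizon), messages, goal, min_gap),
        )
        actions.append(best_v)
        messages.append(rollout(x0, best_v, horizon))
    return actions, messages


if __name__ == "__main__":
    actions, _ = sequential_plan(
        starts=[0.0, -0.2], candidates=[0.0, 0.5, 1.0], goal=5.0
    )
    print(actions)
```

In this toy setting the first agent takes the fastest candidate, and the second agent, seeing that prediction, chooses a slower velocity to keep its distance. This loosely mirrors the yielding behavior described for the real robots, under the simplifying assumptions stated above.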

📈 Curves

SeqWM consistently outperforms state-of-the-art baselines in both Multi-Quad and Bi-DexHands environments.

🎥 Demos

SeqWM solves complex bimanual tasks and supports cooperation among 2–5 quadruped robots.

🌍 Sim2Real Deployment

SeqWM has also been successfully deployed on physical Unitree Go2-W robots, confirming effective sim-to-real transfer.

🧩 Advanced Cooperative Behaviors

As shown below, the robots exhibit predictive adaptation, temporal alignment, and role division: some agents slow down in front of the gate (observable as troughs in their x-axis velocity commands), while others accelerate and pass through first (peaks in their velocity commands). This wave-like pattern across agents reflects turn-taking and priority management, enabling smooth passage without collisions even in highly constrained environments.
