A multi-agent reinforcement learning benchmark for research and industry
Despite the rapid development of multi-agent reinforcement learning (MARL) algorithms, there is a lack of commonly acknowledged baseline implementations and evaluation benchmarks.
MARL researchers therefore urgently need a unified benchmark suite, playing a role similar to RLlib's in single-agent RL, that supports both high-performance MARL implementations and replicable evaluation across a variety of testing environments.
MARLlib includes by far the most comprehensive list of MARL algorithms, covering different categories of game type (cooperative, competitive, mixed), action space (discrete, continuous, multi-discrete), and decision mode (turn-based, simultaneous).
MARLlib introduces a unified interface for MARL implementations.
10 diverse environments are available, and new ones are easy to incorporate.
MARLlib decouples algorithms, neural architectures, and environments, thereby offering great flexibility for systematic benchmarking, along with enhanced debugging tools and easy-to-read code.
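The snippet below is a minimal sketch of what this decoupled workflow looks like. It follows the quick-start pattern from MARLlib's documentation; the specific environment, algorithm, and hyper-parameter values are illustrative and may differ across MARLlib versions.

```python
# A minimal sketch of the decoupled workflow (illustrative; names and
# hyper-parameters follow MARLlib's quick-start but may vary by version).
from marllib import marl

# 1. Environment: pick a task independently of the algorithm.
env = marl.make_env(environment_name="mpe", map_name="simple_spread")

# 2. Algorithm: choose a MARL algorithm with a preset hyper-parameter source.
mappo = marl.algos.mappo(hyperparam_source="mpe")

# 3. Neural architecture: build the model separately from env and algorithm.
model = marl.build_model(env, mappo, {"core_arch": "mlp", "encode_layer": "128-256"})

# 4. Training: the three components are combined only at fit time.
mappo.fit(env, model, stop={"timesteps_total": 1_000_000}, share_policy="group")
```

Because the environment, algorithm, and model are constructed separately, swapping any one of them (for example, a different map or a recurrent core architecture) leaves the other two lines untouched.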
MARLlib provides replicable and reloadable performance results across tens of environments, each with detailed hyper-parameter settings and training logs.
MARLlib unifies both multi-agent environments and MARL algorithms in one framework.
In MARLlib, we mainly implement and unify three types of algorithms: independent learning (IL), centralized critic (CC), and value decomposition (VD), chosen for their broad coverage of the field.
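As a rough illustration of how the three families map to concrete algorithms, the sketch below pairs each with one representative entry. The algorithm names (iql, mappo, qmix) are taken from MARLlib's algorithm registry, but treat the exact attribute names and availability as an assumption that may vary by version.

```python
# Illustrative mapping of the three algorithm families to representative
# algorithms (attribute names assume MARLlib's registry; may vary by version).
from marllib import marl

algorithm_families = {
    "IL": marl.algos.iql,    # independent learning: each agent learns on its own data
    "CC": marl.algos.mappo,  # centralized critic: decentralized actors, centralized value
    "VD": marl.algos.qmix,   # value decomposition: joint Q factored into per-agent utilities
}

# All families share the same call pattern, so switching the learning paradigm
# is a one-line change.
algo = algorithm_families["CC"](hyperparam_source="mpe")
```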
In MARLlib environments, agents are not required to act simultaneously. No transition data is shared among agents except the terminal signal `done`. Multiple environments and tasks are supported. See the figure below for details.
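To make the per-agent data flow concrete, here is a minimal sketch of the dictionary convention used by RLlib-style multi-agent environments, which MARLlib builds on. The toy environment and dummy policy are hypothetical; they only illustrate that not every agent has to act at every step and that agents share nothing beyond the global `__all__` done flag.

```python
# Hypothetical toy environment illustrating the per-agent dictionary convention.
import random


class ToyTurnBasedEnv:
    """Two agents take turns; each sees only its own observation and reward."""

    def __init__(self):
        self.turn = 0
        self.steps = 0

    def reset(self):
        self.turn, self.steps = 0, 0
        return {f"agent_{self.turn}": [0.0]}      # only the acting agent observes

    def step(self, action_dict):
        self.steps += 1
        acting = f"agent_{self.turn}"
        self.turn = 1 - self.turn                 # pass the turn to the other agent
        finished = self.steps >= 10
        obs = {} if finished else {f"agent_{self.turn}": [float(self.steps)]}
        rewards = {acting: random.random()}       # reward only for the agent that acted
        dones = {"__all__": finished}             # the only signal shared by all agents
        return obs, rewards, dones, {}


env = ToyTurnBasedEnv()
obs, dones = env.reset(), {"__all__": False}
while not dones["__all__"]:
    actions = {agent_id: 0 for agent_id in obs}   # dummy per-agent policy
    obs, rewards, dones, info = env.step(actions)
```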