Agent-Centric Representations for
Multi-Agent Reinforcement Learning
Wendy Shang (DeepMind)
Lasse Espeholt, Anton Raichuk, Tim Salimans (Google Research)
Object-centric representations have recently enabled significant progress in tackling relational reasoning tasks. By building a strong object-centric inductive bias into neural architectures, recent efforts have improved the generalization and data efficiency of machine learning algorithms for these problems. One problem class involving relational reasoning that remains under-explored is multi-agent reinforcement learning (MARL). Here we investigate whether object-centric representations are also beneficial in the fully cooperative MARL setting. Specifically, we study two ways of incorporating an agent-centric inductive bias into our RL algorithm: (1) introducing an agent-centric attention module with explicit connections across agents, and (2) adding an agent-centric unsupervised predictive objective (i.e., one that does not use action labels), used either as an auxiliary loss during MARL training or as the basis of a pre-training step. We evaluate these approaches on the Google Research Football environment as well as DeepMind Lab 2D. Empirically, agent-centric representation learning leads to the emergence of more complex cooperation strategies between agents, as well as enhanced sample efficiency and generalization.
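The two agent-centric components above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the projection matrices, dimensions, and the simple one-step MSE prediction loss are assumptions chosen for clarity. It shows (1) one attention head over the agent axis, where every agent's query attends to all agents' keys, and (2) an action-free predictive loss that regresses each agent's next-step embedding from its current embedding.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(feats, Wq, Wk, Wv):
    """One attention head over the agent axis.

    feats: (n_agents, d) per-agent embeddings. Each agent's query
    attends over all agents' keys, giving explicit cross-agent
    connections (an agent-centric inductive bias).
    """
    q, k, v = feats @ Wq, feats @ Wk, feats @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])      # (n_agents, n_agents)
    weights = softmax(scores, axis=-1)           # rows sum to 1
    return weights @ v, weights

def predictive_loss(feats_t, feats_tp1, W_pred):
    """Unsupervised auxiliary objective (no action labels): predict
    each agent's next-step embedding from its current one. The linear
    predictor W_pred is a placeholder for a learned prediction head."""
    pred = feats_t @ W_pred
    return np.mean((pred - feats_tp1) ** 2)

# Toy example: 4 agents, 8-dim embeddings at consecutive timesteps.
n_agents, d = 4, 8
feats_t = rng.normal(size=(n_agents, d))
feats_tp1 = rng.normal(size=(n_agents, d))
Wq, Wk, Wv, Wp = (0.1 * rng.normal(size=(d, d)) for _ in range(4))

mixed, weights = agent_attention(feats_t, Wq, Wk, Wv)
loss = predictive_loss(feats_t, feats_tp1, Wp)
```

In the full method the attention weights (one matrix per head) are what the visualization panels below display; this sketch uses a single head and a linear predictor where the paper uses learned neural modules.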
Built-in AI vs Self-Play AI
Both policies are used as (fixed) training opponents in our GRF experiments.
The built-in AI is simple and easily exploited.
The self-play AI (trained via self-play) is a much stronger opponent: harder to exploit, more robust, and better at generalizing.
Thus, results against the self-play AI better reflect an algorithm's ability to handle tasks that demand complex cooperation.
Figure legend — Top/bottom rows: attention heads 1 and 2. Columns 1–4: players 1–4 (highlighted in green). Yellow: home-team agents. Blue: opponent agents (self-play AI). Red circles: attention intensity.