Emergent Complexity via Multi-agent Competition

Paper: https://arxiv.org/abs/1710.03748

ants-run-to-goal.mp4

humans-run-to-goal.mp4

humans-you-shall-not-pass.mp4

Ant-Sumo.mp4

sumo-fights.mp4

kick_defend_compilation_old.mp4

kick-and-defend-robust.mp4

Sumo agent trained in an ensemble of 3 policies

sumo-ensemble.mp4

Left: Humanoid trained on Sumo

Right: Humanoid trained on walking

The length of the arrow indicates the magnitude of the applied force, which varies from 400 to 800.

wind-attack-sumo.mp4

wind-attack-classic.mp4

Left: Kick and Defend agents trained without curriculum (no annealing of the dense exploration reward)

Right: Humanoid Sumo agent trained without curriculum (no annealing of the dense exploration reward)

nocurri-football.mp4

nocurri-sumo.mp4
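
The "no annealing of the dense exploration reward" in the captions above refers to the exploration curriculum. As a rough illustration only, the sketch below assumes a linear schedule and a simple weighted mix of the dense reward and the competition reward. The function names and the `anneal_iterations` parameter are hypothetical and are not taken from the released code.

```python
def annealing_coefficient(iteration: int, anneal_iterations: int) -> float:
    """Hypothetical linear schedule: 1.0 at iteration 0, 0.0 from anneal_iterations on."""
    if anneal_iterations <= 0:
        return 0.0
    return max(0.0, 1.0 - iteration / anneal_iterations)


def mixed_reward(dense_reward: float, competition_reward: float, alpha: float) -> float:
    """Assumed mix: alpha weights the dense exploration reward,
    (1 - alpha) weights the sparse competition reward."""
    return alpha * dense_reward + (1.0 - alpha) * competition_reward


if __name__ == "__main__":
    # Illustrative values only; they are not results from the paper.
    for iteration in (0, 50, 100, 200):
        alpha = annealing_coefficient(iteration, anneal_iterations=100)
        print(iteration, alpha, mixed_reward(2.0, 10.0, alpha))
```

Under these assumptions, holding `alpha` at a constant value instead of decaying it would correspond to the "without curriculum" setting shown in the last two videos.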