Emergent Complexity via Multi-agent Competition
Code for Environments and Trained Policies: https://github.com/openai/multiagent-competition
Task 1: Run to Goal
ants-run-to-goal.mp4
humans-run-to-goal.mp4
Task 2: You Shall Not Pass
humans-you-shall-not-pass.mp4
Task 3: Sumo
Ant-Sumo.mp4
sumo-fights.mp4
Task 4: Kick and Defend
kick_defend_compilation_old.mp4
kick-and-defend-robust.mp4
Training against an Ensemble of Policies
Sumo agent trained against an ensemble of 3 policies
sumo-ensemble.mp4
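Training against an ensemble means the learner faces a randomly drawn opponent from a small pool each episode, so it cannot overfit to one opponent's quirks. A minimal sketch of that opponent-sampling loop is below; the class and the placeholder policies are illustrative assumptions, not the released training code.

```python
import random

class EnsembleOpponentPool:
    """Hypothetical sketch: at the start of each episode the learner
    faces a policy drawn uniformly at random from a small pool."""

    def __init__(self, policies):
        # e.g. 3 independently trained/initialized opponent policies
        self.policies = policies

    def sample(self):
        # Uniform draw over the ensemble, once per episode
        return random.choice(self.policies)

# Illustrative usage with placeholder "policies" (plain callables here).
pool = EnsembleOpponentPool([lambda obs: 0, lambda obs: 1, lambda obs: 2])
opponent = pool.sample()
action = opponent(None)  # an episode would then be rolled out vs. this opponent
```

In the actual experiments the pool would hold past or parallel-trained policy networks rather than constants; only the uniform per-episode sampling is the point here.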
Robustness of the Learned Policy to Wind Attacks
Left: Humanoid trained on Sumo
Right: Humanoid trained on walking
The length of the arrow indicates the applied force, which varies from 400 to 800
wind-attack-sumo.mp4
wind-attack-classic.mp4
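The wind attack amounts to applying a random horizontal external force to the humanoid (in MuJoCo this would go through the body's external-force interface). A small sketch of drawing such a force in the 400-to-800 range shown above; the function name, the horizontal-only direction, and the units (assumed Newtons) are illustrative assumptions.

```python
import math
import random

def sample_wind_force(min_mag=400.0, max_mag=800.0):
    """Draw a horizontal wind force with random direction and a
    magnitude in [min_mag, max_mag] (the 400-800 range above;
    units assumed to be the simulator's force units)."""
    magnitude = random.uniform(min_mag, max_mag)
    angle = random.uniform(0.0, 2.0 * math.pi)
    # Horizontal push only: zero vertical component
    return (magnitude * math.cos(angle), magnitude * math.sin(angle), 0.0)

fx, fy, fz = sample_wind_force()
```

The resulting vector would then be written into the simulator's per-body external force (e.g. MuJoCo's `xfrc_applied`) on the torso for the duration of the gust.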
Effect of Exploration Curriculum
Left: Kick and Defend agents trained without curriculum (no annealing of the dense exploration reward)
Right: Humanoid Sumo agent trained without curriculum (no annealing of the dense exploration reward)
nocurri-football.mp4
nocurri-sumo.mp4
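The exploration curriculum referenced above anneals a dense shaping reward toward the sparse competition reward over training. A minimal sketch of such an annealing schedule, assuming a linear decay of a mixing coefficient alpha from 1 to 0 (the function name, schedule shape, and step count are assumptions, not the authors' exact code):

```python
def curriculum_reward(dense_reward, sparse_reward, t, anneal_steps=500):
    """Sketch of the exploration curriculum: alpha decays linearly
    from 1 to 0, so early training is driven by the dense exploration
    reward and late training by the sparse competition outcome."""
    alpha = max(0.0, 1.0 - t / anneal_steps)
    return alpha * dense_reward + (1.0 - alpha) * sparse_reward
```

The "no curriculum" ablations in the videos above correspond to keeping the dense reward un-annealed (alpha fixed at 1), which lets agents satisfy the shaping terms without ever learning to win the competition.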