In the single-agent version of Google Research Football, the AI controls the player closest to the ball while the remaining players are controlled by a built-in heuristic. We focus on the multi-agent version of Google Research Football, where each of the players can be controlled separately. This webpage summarizes the progress of our experiments, and below you can see some of the interesting behaviors that have emerged so far. In particular, we discovered in some training scenarios that our agents love offsides and learned to rely on them completely.
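As a concrete starting point, here is one way to create the multi-agent setup with the open-source `gfootball` package; the snippet follows the public `create_environment` API, though the exact defaults may differ across library versions.

```python
import gfootball.env as football_env

# Full 11 vs 11 game in which the agents control the whole left team.
env = football_env.create_environment(
    env_name='11_vs_11_stochastic',
    representation='simple115',               # 115-float vector per player
    number_of_left_players_agent_controls=11,
)

obs = env.reset()        # one observation per controlled player
actions = [0] * 11       # one action per controlled player (0 = idle)
obs, reward, done, info = env.step(actions)
```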
Our goal is to train a set of agents, starting from a random policy, into a cooperating team. The means to this end is self-play against current and previous versions of the same policy. We are also improving tooling, which often lacks full support for the multi-agent setup. See the details on our blog.
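To make the self-play scheme concrete, here is a minimal sketch with illustrative names (`SelfPlayPool`, `p_latest` are our own, not from any library): the learning policy plays mostly against its latest frozen copy, with some games against older checkpoints for diversity.

```python
import random


class SelfPlayPool:
    """Keeps frozen checkpoints of our own policy to sample opponents from."""

    def __init__(self, keep_last=10, p_latest=0.5):
        self.checkpoints = []     # paths to frozen past policies
        self.keep_last = keep_last
        self.p_latest = p_latest  # fraction of games against the newest version

    def add(self, checkpoint_path):
        self.checkpoints.append(checkpoint_path)
        self.checkpoints = self.checkpoints[-self.keep_last:]

    def sample_opponent(self):
        # Mostly the latest copy (strongest), sometimes an older one (diversity).
        if len(self.checkpoints) < 2 or random.random() < self.p_latest:
            return self.checkpoints[-1]
        return random.choice(self.checkpoints[:-1])


pool = SelfPlayPool()
pool.add('ckpt_001')
pool.add('ckpt_002')
opponent = pool.sample_opponent()  # load this policy to control the right team
```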
Our team members:
One of the first things our agents learn is the offside mechanic. Here you can see what happens when they start to overuse it.
SEED RL is a training framework with an open-source implementation that works out of the box on Google Cloud. Here we describe how to use it to train football agents.
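The key idea behind SEED RL is centralized inference: actors only step the environment and ship observations to the learner, which batches them for a single policy forward pass. The sketch below illustrates that loop with plain Python queues and a stand-in random policy; it is a conceptual illustration, not SEED RL's actual API.

```python
import queue
import threading

import numpy as np

BATCH = 4  # observations batched together for one policy forward pass


def actor(actor_id, obs_q, act_qs, num_steps=100):
    """Environment-only worker: it holds no copy of the policy."""
    obs = np.zeros(115, dtype=np.float32)    # placeholder observation
    for _ in range(num_steps):
        obs_q.put((actor_id, obs))
        action = act_qs[actor_id].get()      # wait for the learner's reply
        # A real actor would call env.step(action) here.
        obs = np.random.randn(115).astype(np.float32)


def learner(obs_q, act_qs, num_steps=100):
    """Batches observations from all actors and runs one inference per batch."""
    for _ in range(num_steps):
        ids, observations = zip(*[obs_q.get() for _ in range(BATCH)])
        batch = np.stack(observations)                  # shape (BATCH, 115)
        # Stand-in for policy(batch); 19 is the size of the default action set.
        actions = np.random.randint(0, 19, size=BATCH)
        for actor_id, action in zip(ids, actions):
            act_qs[actor_id].put(int(action))


obs_q = queue.Queue()
act_qs = [queue.Queue() for _ in range(BATCH)]
threads = [threading.Thread(target=actor, args=(i, obs_q, act_qs))
           for i in range(BATCH)]
threads.append(threading.Thread(target=learner, args=(obs_q, act_qs)))
for t in threads:
    t.start()
for t in threads:
    t.join()
```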
Training with the scoring reward is hard because rewards are sparse: an agent rarely gets feedback about its performance. This is why we introduced a curriculum of academy-like scenarios of increasing difficulty.
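A minimal sketch of such a curriculum, assuming the academy scenarios that ship with Google Research Football; the promotion rule (mean episode return over a recent window) and the 0.8 threshold are illustrative choices, not our exact settings.

```python
import gfootball.env as football_env

# Academy scenarios shipped with the game, ordered from easiest to hardest.
CURRICULUM = [
    'academy_empty_goal_close',
    'academy_empty_goal',
    'academy_run_to_score',
    'academy_3_vs_1_with_keeper',
    'academy_counterattack_easy',
    'academy_counterattack_hard',
]


class Curriculum:
    """Promotes the agent to the next scenario once it reliably scores."""

    def __init__(self, threshold=0.8, window=100):
        self.stage = 0
        self.threshold = threshold  # mean episode return required to advance
        self.window = window        # number of recent episodes to average over
        self.returns = []

    def make_env(self):
        return football_env.create_environment(
            env_name=CURRICULUM[self.stage], rewards='scoring')

    def report(self, episode_return):
        """Call once per finished episode; returns True when the stage changes."""
        self.returns = (self.returns + [episode_return])[-self.window:]
        mean = sum(self.returns) / len(self.returns)
        if mean > self.threshold and self.stage + 1 < len(CURRICULUM):
            self.stage += 1
            self.returns = []
            return True
        return False
```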