In this tutorial we show how to train GFootball agents using SEED RL (Scalable and Efficient Deep-RL), a scalable reinforcement learning agent that allows training at millions of frames per second. An open-source implementation can be found in this repository. We will show how to run an experiment on AI Platform as well as how to modify the SEED repository to allow a multi-agent setup.
apt-get install git - install git
Having docker installed is necessary. If you don't have it installed, follow these steps:
usermod -aG docker $USER - add your user to the docker group
Gcloud is used to upload jobs to AI Platform. You can install and configure it using these commands:
gcloud auth login - authenticate with your account
gcloud config set project <project-name> - configure the project name
gcloud auth configure-docker - allow docker to be used with gcloud
Clone the SEED RL repo:
git clone https://github.com/google-research/seed_rl.git
cd seed_rl
Now it's time to actually run an experiment using AI Platform:
Note that running the command below will create numerous machines on Google Cloud Platform. In this configuration it will incur a cost of about 170 USD per day of training (as of March 2020). Details about the machines assigned to particular jobs can be found here: https://console.cloud.google.com/ai-platform/jobs.
gcp/train_football_checkpoints.sh - starts training with a single-agent setup on SMM observations.
The above script first builds a docker image that contains code for both the learner (master) and the workers. Briefly, workers execute multiple environment instances and send observations along with rewards to the learner, which performs inference and sends back the actions. The chart below shows an overview of the architecture; more details on the algorithm can be found in the SEED paper.
Source: SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
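To make the division of labour concrete, here is a minimal sketch of the actor/learner split. Everything below is illustrative: LearnerStub and actor_loop are hypothetical names, and in the real system the inference call is a remote, batched gRPC request served from an accelerator, not a local object.

import numpy as np

class LearnerStub:
  """Hypothetical stand-in for SEED's central learner. In the real
  system this is a remote (gRPC) call: the learner batches
  observations from many actors and runs inference centrally."""

  def inference(self, env_id, observation, reward, done):
    # Placeholder policy: pick a random action out of GFootball's
    # 19 discrete actions (default action set).
    return np.random.randint(0, 19)

def actor_loop(env, learner, num_steps=1000):
  """Simplified actor: steps the environment and defers every
  policy decision to the central learner."""
  observation = env.reset()
  reward, done = 0.0, False
  for _ in range(num_steps):
    action = learner.inference(0, observation, reward, done)
    observation, reward, done, _ = env.step(action)
    if done:
      observation = env.reset()
      reward, done = 0.0, False

Because actors never hold model parameters, scaling up means simply adding more cheap CPU workers around the single accelerated learner.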
The image is then pushed to your GCP registry. After that the script prepares configurations for running experiments on AI Platform. It specifies the types of machines and their number, as well as hyperparameters like the learning rate or entropy cost.
By default the training starts with 25 CPU-only workers for running environments and one learner with a GPU.
If you wish to change, e.g., the number of workers used in the experiment, you can modify the script (seed_rl/gcp/train_football_checkpoints.sh#L24). Of course other parameters can be changed as well:
export CONFIG=football
export ENVIRONMENT=football
export AGENT=vtrace
export WORKERS=50
export ACTORS_PER_WORKER=8
With this configuration, 50 workers run 8 actors each, i.e. 400 environment instances in parallel. The first time you run the experiment the docker image must be built, which can take some time; subsequent runs reuse the already created docker layers. You can check whether your job was successfully uploaded by logging into the AI Platform console (the job is named SEED_<timestamp>). Setting up machines might take a few minutes. After that, data from the experiment should appear in your bucket. The Google Cloud Platform bucket is named seed_rl by default.
GFootball offers several scenarios. By default SEED trains on the full 11 vs 11 game against difficult bots (11_vs_11_hard_stochastic). GFootball also provides several academy scenarios which pose challenges of increasing difficulty.
For example, in the academy_3_vs_1_with_keeper scenario we control 3 players who start near the opponent's goal, which is guarded by a goalkeeper and one defender.
To change the scenario you have to change the game parameter in seed_rl/gcp/train_football_scoring.sh#L45:
- parameterName: game
  type: CATEGORICAL
  categoricalValues:
  - academy_3_vs_1_with_keeper
You can save the changed file as train_football_3vs1_with_keeper.sh and use it to start a training.
The code in the SEED repository does not support a multi-agent setup out of the box. If you wish to use a multi-agent setup, you need to modify the SEED repo.
Example modifications: https://github.com/Zuuja/seed_rl/tree/multiagent
The list below highlights the code you need to look at to get a multi-agent setup running.
Pass the number_of_left_players_agent_controls argument to the gym.make call here: seed_rl/football/env.py
To use the default network you need to provide it with exactly one observation and one scalar reward. The GFootball environment outputs a separate observation and reward for each of the controlled players. You can join them by creating wrappers like these:
class SampleMultiAgentRewardWrapper(gym.RewardWrapper):
  def __init__(self, env):
    super(SampleMultiAgentRewardWrapper, self).__init__(env)

  def reward(self, reward):
    # Collapse per-player rewards into one scalar for the network.
    return numpy.max(reward)


# Beware that this wrapper is probably not the best (information
# about players and ball is duplicated)
class SampleMultiAgentObservationWrapper(gym.ObservationWrapper):
  def __init__(self, env):
    super(SampleMultiAgentObservationWrapper, self).__init__(env)
    # Merge the per-player axis into the channel axis:
    # [players, X, Y, L] -> [X, Y, players * L].
    observation_shape = env.observation_space.shape[1:-1] + \
        (env.observation_space.shape[0] *
         env.observation_space.shape[-1],)
    self.observation_space = gym.spaces.Box(
        low=0, high=255, shape=observation_shape, dtype=numpy.uint8)

  def observation(self, observation):
    return numpy.concatenate(observation, axis=-1)

You need to apply the wrappers when the environment is created (in seed_rl/football/env.py):
flags.DEFINE_integer('controlled_agents', 1,
                     'Number of controlled left agents')

...

def create_environment(_):
  """Returns a gym Football environment."""
  logging.info('Creating environment: %s', FLAGS.game)
  assert FLAGS.num_action_repeats == 1, 'Only action repeat of 1 is supported.'
  channel_dimensions = {
      'default': (96, 72),
      'medium': (120, 90),
      'large': (144, 108),
  }[FLAGS.smm_size]
  env = gym.make(
      'gfootball:GFootball-%s-SMM-v0' % FLAGS.game,
      stacked=True,
      rewards=FLAGS.reward_experiment,
      channel_dimensions=channel_dimensions,
      number_of_left_players_agent_controls=FLAGS.controlled_agents)
  # Beware that the football network expects one scalar reward
  # and an observation of shape [X, Y, L]
  env = SampleMultiAgentRewardWrapper(env)
  env = SampleMultiAgentObservationWrapper(env)
  # Beware that PackedBitsObservation expects that the observation
  # consists of 255s and 0s
  return observation.PackedBitsObservation(env)

Above we added the flag for the number of controlled players. You need to add it to the starting script (for example gcp/train_football_3vs1_with_keeper.sh). To control 3 players you need to add:
- parameterName: game
  type: CATEGORICAL
  categoricalValues:
  - academy_3_vs_1_with_keeper
- parameterName: controlled_agents
  type: INTEGER
  minValue: 3
  maxValue: 3
You can run an experiment in the same manner as in the single-agent setup, i.e. via the gcp/train_football_3vs1_with_keeper.sh script created above.
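Before launching a cloud run, it can help to sanity-check the wrappers locally. A minimal sketch, assuming gfootball is installed and the wrapper classes from the snippet above are in scope; the scenario is just an example, and the printed shapes depend on your stacking and SMM settings:

import gym

# Create a multi-agent env with 3 controlled left players.
env = gym.make(
    'gfootball:GFootball-academy_3_vs_1_with_keeper-SMM-v0',
    stacked=True,
    number_of_left_players_agent_controls=3)
print(env.observation_space.shape)  # e.g. (3, 72, 96, 16): one entry per player

env = SampleMultiAgentRewardWrapper(env)
env = SampleMultiAgentObservationWrapper(env)
print(env.observation_space.shape)  # e.g. (72, 96, 48): players merged into channels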
The described modifications are present here: https://github.com/Zuuja/seed_rl/tree/multiagent
Screenshot from tensorboard after 10.5 hours of training on the 3 vs 1 with keeper scenario:
The GFootball environment can provide plenty of information about the current game state, e.g. the position, velocity, tiredness factor, and current action of each player; the position, velocity, and rotation of the ball; and more. The environment provides three different representations: super mini-map (SMM), simple115 (floats), and rendered game frames (pixels). Each way of encoding the data has its pros and cons, e.g. one can take little space whilst another can enable faster learning.
By default SEED uses the SMM representation, where observations consist of several 72 by 96 planes of byte data. Each plane focuses on a different aspect of the game, e.g. the positions of the right/left team or the position of the ball.
Another way of representing the game state is simple115, a.k.a. floats. In this representation the whole game state is encoded in 115 floats. More detailed information about the possible game representations can be found in the GFootball documentation: gfootball/doc/observation.md.
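If you want to inspect the representations yourself, a quick sketch (assuming gfootball is installed locally; the scenario name is just an example) compares the observation spaces of the two encodings:

import gym

# SMM: stacked planes of byte data.
smm_env = gym.make('gfootball:GFootball-academy_3_vs_1_with_keeper-SMM-v0')
print(smm_env.observation_space)  # planes of 72x96 uint8 data

# simple115: the whole state packed into 115 floats.
floats_env = gym.make(
    'gfootball:GFootball-academy_3_vs_1_with_keeper-simple115-v0')
print(floats_env.observation_space)  # a flat vector of 115 floats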
To change the representation you need to modify the same locations as when changing to the multi-agent setup: the network definition and creation, the env creation, and the packed bits observation wrapper.
For example, if you want to adopt the float representation instead of the mini-map, you can use another network (see for example seed_rl/agents/vtrace/networks.py). Since floats don't need compression you should remove the PackedBitsObservation wrapper (at seed_rl/football/env.py#L49):
return observation.PackedBitsObservation(env)
You also need to change the gym.make command to change the observation format (seed_rl/football/env.py#L45).
For example, to use the simple115 observation (the floats observation), change this line:
'gfootball:GFootball-%s-SMM-v0' % FLAGS.game,
to:
'gfootball:GFootball-%s-simple115-v0' % FLAGS.game,
An example of the needed modifications: https://github.com/Zuuja/seed_rl/tree/floats
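As noted above, PackedBitsObservation is removed for floats. For intuition on why SMM benefits from bit packing while floats do not: SMM planes contain only the values 0 and 255, so each pixel carries a single bit and eight pixels fit in one byte. An illustrative numpy sketch (not SEED's actual implementation):

import numpy as np

# A fake 72x96 SMM plane containing only 0s and 255s.
plane = (np.random.rand(72, 96) > 0.5).astype(np.uint8) * 255
# Pack 8 pixels per byte: an 8x smaller payload to ship over the network.
packed = np.packbits(plane.flatten() // 255)
restored = np.unpackbits(packed)[:72 * 96].reshape(72, 96) * 255
assert (restored == plane).all()
print(plane.nbytes, packed.nbytes)  # 6912 vs 864 bytes

A vector of 115 arbitrary floats has no such two-valued structure, so this trick does not apply.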
Screenshot from tensorboard after 9 hours of training on the 3 vs 1 with keeper scenario:
You can also run SEED locally to, for example, test your changes before starting training in the cloud. This section assumes that you are using:
a) You can check tensorflow requirements here
b) You can check your graphics card compute capability here
To run SEED locally you need to:
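The exact steps depend on your setup, but at the time of writing the SEED repository ships a run_local.sh helper script. A hedged example invocation (arguments follow the upstream README; verify them against your checkout):
./run_local.sh football vtrace 4 - runs the football config with the vtrace agent and 4 actors on your local machine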
It is possible to view experiment results via tensorboard, a machine learning experiment visualisation tool:
gcloud auth application-default login - authenticate so that tensorboard has access to your gcloud bucket
tensorboard --logdir=gs://<bucket_name> - starts tensorboard; it can be accessed from the browser at http://localhost:6006/ (6006 is the default port)
The default bucket name used by SEED is seed_rl. Another option is to use tensorboard.dev, which allows users to upload data to the cloud so others can see the results online.
unauthorized: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication
If you run the script using sudo (i.e. as a different user) it is possible that you receive the error above. Running without sudo might help.
The gcloud compute regions list command lists the accessible regions alongside the available resources.
If you don't want your bucket to be named seed_rl (the default name), you can modify the seed_rl/gcp/setup.sh file. You need to change every occurrence of the text seed_rl to your bucket name.
seed_rl/gcp/setup.sh: line 23: seed_rl/gcp/../docker/push.sh: Permission denied
When running an experiment the error above can appear, which means the seed_rl/docker/push.sh script might not have execution rights. chmod +x docker/push.sh should help.
We observed that on some machines there are problems with docker accessing gcloud. This command might be helpful in such scenarios:
gcloud auth print-access-token | docker login -u oauth2accesstoken --password-stdin https://gcr.io