Deep Latent Competition:
Learning to Race Using Visual Control Policies in Latent Space

Wilko Schwarting*, Tim Seyde*, Igor Gilitschenski*,
Lucas Liebenwein, Ryan Sander, Sertac Karaman, Daniela Rus

* equal contribution


Abstract

Learning competitive behaviors in multi-agent settings such as racing requires long-term reasoning about potential adversarial interactions. This paper presents Deep Latent Competition (DLC), a novel reinforcement learning algorithm that learns competitive visual control policies through self-play in imagination. The DLC agent imagines multi-agent interaction sequences in the compact latent space of a learned world model that combines a joint transition function with opponent viewpoint prediction. Imagined self-play reduces costly sample generation in the real world, while the latent representation enables planning to scale gracefully with observation dimensionality. We demonstrate the effectiveness of our algorithm in learning competitive behaviors on a novel multi-agent racing benchmark that requires planning from image observations.

Approach

DLC Agent

The Deep Latent Competition agent reasons about interactions in a learned latent space. This is achieved by training a world model to predict not only its own observations but also those of the opponent. The approach proceeds in three stages: representation learning from observed races; policy learning via imagined model rollouts; and deployment without privileged information by leveraging opponent viewpoint prediction.
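To make the idea concrete, the Python sketch below illustrates the core ingredients in simplified form: a joint latent transition over both agents, an opponent-viewpoint predictor, and imagined self-play rollouts in which the learned policy acts for both cars. Class names, network sizes, and the rollout horizon are illustrative placeholders under simplifying assumptions, not the exact architecture used in the paper.

import torch
import torch.nn as nn

# Hypothetical dimensions; the paper's exact sizes are not reproduced here.
LATENT, ACTION, HORIZON = 32, 3, 15

class JointTransition(nn.Module):
    """Predicts the next joint latent state from both agents' latents and actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * (LATENT + ACTION), 256), nn.ELU(),
            nn.Linear(256, 2 * LATENT),
        )

    def forward(self, z_ego, z_opp, a_ego, a_opp):
        x = torch.cat([z_ego, a_ego, z_opp, a_opp], dim=-1)
        out = self.net(x)
        return out[..., :LATENT], out[..., LATENT:]

class OpponentViewpoint(nn.Module):
    """Infers the opponent's latent state from the ego latent (no privileged information)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT, 256), nn.ELU(), nn.Linear(256, LATENT))

    def forward(self, z_ego):
        return self.net(z_ego)

def imagine_self_play(transition, viewpoint, policy, z_ego):
    """Roll out an imagined race in latent space: the ego policy also acts for the
    imagined opponent, i.e. self-play without further environment interaction."""
    z_opp = viewpoint(z_ego)
    trajectory = []
    for _ in range(HORIZON):
        a_ego, a_opp = policy(z_ego), policy(z_opp)
        z_ego, z_opp = transition(z_ego, z_opp, a_ego, a_opp)
        trajectory.append((z_ego, z_opp, a_ego, a_opp))
    return trajectory

if __name__ == "__main__":
    policy = nn.Sequential(nn.Linear(LATENT, 64), nn.ELU(), nn.Linear(64, ACTION), nn.Tanh())
    traj = imagine_self_play(JointTransition(), OpponentViewpoint(), policy, torch.zeros(1, LATENT))
    print(len(traj), traj[0][0].shape)

In the full method, such imagined rollouts are generated in batch from the latent states of observed races and used to train the policy and value estimates, so that costly real-world interaction is only needed for model learning.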

Environment

Our racing environment is a simple multi-player continuous control task. It extends OpenAI Gym's CarRacing environment so that multiple agents can compete simultaneously. We incentivize competition by awarding a higher reward to the agent that passes a track tile first. The source code is available on GitHub.
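As an illustration of this competitive reward, the sketch below tracks which car reaches each track tile first and pays that car a larger bonus; a later visit earns less, and revisiting a tile earns nothing. The class name and bonus values are hypothetical placeholders and do not correspond to the released environment code.

class TileFirstReward:
    """Per-tile reward bookkeeping for a multi-agent race (illustrative sketch)."""
    def __init__(self, num_agents, num_tiles, first_visit_bonus=1.0, late_visit_bonus=0.5):
        self.num_agents = num_agents
        self.first_visit_bonus = first_visit_bonus
        self.late_visit_bonus = late_visit_bonus
        # For every tile, remember which agents have already passed it.
        self.visited = [set() for _ in range(num_tiles)]

    def reward(self, agent_id, tile_id):
        """Return the reward for agent_id passing tile_id this step."""
        visitors = self.visited[tile_id]
        if agent_id in visitors:
            return 0.0  # no reward for re-visiting a tile
        bonus = self.first_visit_bonus if not visitors else self.late_visit_bonus
        visitors.add(agent_id)
        return bonus

# Example: agent 0 reaches tile 3 before agent 1 and receives the larger bonus.
r = TileFirstReward(num_agents=2, num_tiles=300)
print(r.reward(0, 3), r.reward(1, 3))  # 1.0 0.5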

Example Races

(Six embedded videos showing example races between trained DLC agents.)