This website contains supplementary materials accompanying the ICML 2020 submission.
Single-player environments (Battle and Battle2)
The videos below demonstrate the performance of agents trained with Sample Factory in several 3D environments. The videos are not sped up; the gameplay is shown at the standard framerate (35 FPS for VizDoom). The agents use the same set of controls that is available to a human player.
RL agent vs scripted in-game bots (8-player Deathmatch and Duel scenarios)
Unlike the other videos, this one is shown at the actual game resolution used during training. It was recorded this way because VizDoom does not support scripted bots in the game demo files that were used to record the other videos. In this experiment we set frameskip to 2 instead of the traditional 4 frames.
Self-Play VizDoom Duel
The agents were trained using self-play and population-based training with 8 competing policies on a 36-core server with four GPUs. In this experiment we set frameskip to 2 instead of the traditional 4 frames. The agents in the videos were trained on 2.5 billion environment transitions (about 18 years of VizDoom gameplay for agents in the population).
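The "about 18 years" figure can be sanity-checked with a short calculation. This is only a sketch of one reading that reproduces the number: it assumes that each of the 8 policies was trained on 2.5 billion transitions and that each transition counts as one game frame at 35 FPS. The page does not spell out how the figure was derived, so both assumptions (and all names below) are illustrative.

```python
# Rough conversion from environment transitions to years of real-time gameplay.
# Assumptions (not stated on the page): 2.5e9 transitions per policy,
# one transition counted as one game frame, 35 frames per second.

GAME_FPS = 35                          # standard VizDoom framerate
TRANSITIONS_PER_POLICY = 2.5e9         # assumed to be a per-policy figure
NUM_POLICIES = 8                       # policies in the population
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def gameplay_years(frames, fps=GAME_FPS):
    """Convert a number of game frames to years of real-time play."""
    return frames / fps / SECONDS_PER_YEAR


per_policy_years = gameplay_years(TRANSITIONS_PER_POLICY)
population_years = gameplay_years(TRANSITIONS_PER_POLICY * NUM_POLICIES)

print(f"per policy: {per_policy_years:.1f} years")
print(f"population: {population_years:.1f} years")
```

Under these assumptions the population total comes out at roughly 18 years, which matches the figure quoted above. A different counting convention (for example, multiplying each transition by the frameskip) would give a different number.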
High-level architecture of Sample Factory
Timeline diagram of double-buffered sampling