Abstract
Multi-agent learning algorithms have been successful at generating superhuman planning in various games but have had limited impact on the design of deployed multi-agent planners. A key bottleneck in applying these techniques to multi-agent planning is that they require billions of steps of experience. To enable the study of multi-agent planning at scale, we present GPUDrive, a GPU-accelerated, multi-agent simulator built on top of the Madrona Game Engine that can generate over a million simulation steps per second. Observation, reward, and dynamics functions are written directly in C++, allowing users to define complex, heterogeneous agent behaviors that are lowered to high-performance CUDA. We show that using GPUDrive we can effectively train reinforcement learning agents over many scenes in the Waymo Open Motion Dataset, yielding highly effective goal-reaching agents in minutes for individual scenes and enabling agents to navigate thousands of scenarios within hours.
GPUDrive runs at over 1 million FPS, making large-scale multi-agent planning accessible on small compute budgets.
GPUDrive supports different observation spaces: a radial filter and LiDAR.
GPUDrive supports a diverse set of multi-agent traffic scenarios from the Waymo Open Motion Dataset (N = 103,354).
Here we demonstrate a policy, pre-trained on 1,000 scenarios, controlling all the vehicles in a scene:
Use the simulator through the available Python bindings and Gym environments in PyTorch and JAX:
from pygpudrive.env.env_torch import GPUDriveTorchEnv

# env_config, scene_config, and render_config are constructed beforehand
# from the corresponding configuration objects

# Make env
env = GPUDriveTorchEnv(
    config=env_config,
    scene_config=scene_config,
    max_cont_agents=128,  # maximum number of controlled agents per scene
    device="cuda",
    render_config=render_config,
    action_type="continuous",  # or "discrete"
)

# Step through hundreds of scenarios in parallel
obs = env.reset()
for t in range(env_config.episode_len):
    actions = env.action_space.sample()  # random actions
    env.step_dynamics(actions)
    obs = env.get_obs()
    info = env.get_infos()
    dones = env.get_dones()
    if dones.all():
        break
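
To roll out a trained policy instead of random actions, the same loop applies. The sketch below assumes a hypothetical policy callable (for example, a trained PyTorch module) that maps the current batch of observations to a batch of actions; only the API calls already shown above are used:

# Minimal sketch: swap random action sampling for a (hypothetical) policy call
obs = env.reset()
for t in range(env_config.episode_len):
    actions = policy(obs)  # hypothetical callable mapping observations to actions
    env.step_dynamics(actions)
    obs = env.get_obs()
    dones = env.get_dones()
    if dones.all():
        break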