Flow: A Modular Learning Framework for Autonomy in Traffic

Authors: Cathy Wu, Aboudy Kreidieh, Kanaad Parvate, Eugene Vinitsky, and Alexandre Bayen

The rapid development of autonomous vehicles (AVs) holds vast potential for transportation systems through improved safety, efficiency, and access to mobility. However, new methodologies are needed for the design of vehicles and transportation systems to enable these positive outcomes, due to numerous technical, political, and human-factors challenges. This article focuses on important technical challenges arising from the partial adoption of autonomy (termed mixed autonomy, since it involves both AVs and human-driven vehicles): partial control and partial observation, complex dynamics of multi-vehicle interactions, and the sheer variety of traffic settings represented by real-world networks. To enable the study of the full diversity of traffic settings, we first propose decomposing traffic control tasks into components called modules, which may be configured and composed to create new control tasks of interest. These modules capture salient aspects of traffic control tasks: networks, actors, control laws, metrics, initialization, and additional dynamics. Second, we study the potential of model-free deep reinforcement learning (RL) methods to address the complexity of traffic dynamics. The resulting modular learning framework is called Flow. Using Flow, we create and study a variety of mixed-autonomy settings, including single-lane, multi-lane, and intersection traffic. In all cases, the learned control law exceeds human driving performance (measured by system-level average velocity) by at least 40% with only 5-10% adoption of AVs. In the case of partially observed single-lane traffic, we show that a low-parameter neural network control law can eliminate commonly observed stop-and-go waves in traffic. In particular, the control laws surpass all known model-based controllers to achieve near-optimal performance across a wide spectrum of vehicle densities (even with a memoryless control law) and additionally generalize to out-of-distribution vehicle densities.
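The idea of composing tasks from modules can be illustrated with a minimal sketch. This is not Flow's API: the class names `Network`, `Actors`, and `Task`, the helper `compose_tasks`, and the vehicle counts are all hypothetical and chosen only for illustration. Only three of the modules (networks, actors, metrics) are modeled here.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Sequence

# A metric maps the vehicles' speeds (m/s) to a single score.
Metric = Callable[[Sequence[float]], float]


@dataclass(frozen=True)
class Network:
    """Hypothetical network module: where the vehicles drive."""
    name: str
    lanes: int = 1


@dataclass(frozen=True)
class Actors:
    """Hypothetical actor module: how many vehicles of each kind."""
    num_human: int
    num_av: int

    @property
    def av_fraction(self) -> float:
        # Share of vehicles that are autonomous (the adoption rate).
        return self.num_av / (self.num_human + self.num_av)


@dataclass(frozen=True)
class Task:
    """A control task assembled from independent modules."""
    network: Network
    actors: Actors
    metric: Metric


def average_velocity(speeds: Sequence[float]) -> float:
    """System-level average velocity, the metric named in the abstract."""
    return sum(speeds) / len(speeds)


def compose_tasks(networks: List[Network],
                  actor_sets: List[Actors],
                  metric: Metric) -> List[Task]:
    """Build one task per (network, actors) pair, sharing one metric."""
    return [Task(n, a, metric) for n, a in product(networks, actor_sets)]
```

Under these assumptions, two networks and two actor configurations yield four tasks, and swapping any one module leaves the others unchanged.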

Flow is open-source and available at: https://github.com/flow-project/flow

Documentation: https://flow.readthedocs.io/

Stabilizing a single-lane ring

Here we see a simulation of the famous experiment by Sugiyama et al., in which 22 vehicles on a 230 m ring road give rise to instabilities called "stop-and-go waves". This reproduced result is followed by four different control laws (two learned, and two model-based control laws from the literature), each of which replaces one of the vehicles with a controlled vehicle (AV). Some of the videos are shown for a 260 m ring (the length at which the explicit controllers are calibrated), and others show rollouts with varying densities (different ring lengths). The GRU control law performs best overall, and it is able to stabilize ring sizes even outside of the training regime.
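For readers who want to experiment with the ring setting outside of Flow, the following is a minimal, self-contained sketch of 22 human-driven vehicles on a 230 m single-lane ring. It assumes the Intelligent Driver Model (IDM) as the car-following law for human drivers; the choice of model, the parameter values, the 5 m vehicle length, and the function names `idm_accel` and `simulate_ring` are assumptions made for illustration, not details taken from this page. Depending on the parameters, small initial speed differences may grow into stop-and-go behavior.

```python
import math
import random


def idm_accel(v, v_lead, gap, v0=30.0, T=1.0, a=1.0, b=1.5, delta=4, s0=2.0):
    """IDM acceleration (m/s^2) for a follower at speed v behind a leader.

    gap is the bumper-to-bumper distance in meters. All parameter
    defaults are illustrative assumptions.
    """
    # Desired gap: standstill distance plus a speed-dependent term.
    s_star = s0 + max(0.0, v * T + v * (v - v_lead) / (2 * math.sqrt(a * b)))
    # Clamp the gap to avoid division by zero if vehicles get very close.
    return a * (1 - (v / v0) ** delta - (s_star / max(gap, 0.1)) ** 2)


def simulate_ring(n=22, length=230.0, car_len=5.0, steps=3000, dt=0.1, seed=0):
    """Euler-integrate n IDM vehicles on a ring; return (positions, speeds)."""
    rng = random.Random(seed)
    # Evenly spaced start with slightly different speeds as a perturbation.
    pos = [i * length / n for i in range(n)]
    vel = [rng.uniform(0.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        acc = []
        for i in range(n):
            lead = (i + 1) % n  # vehicle directly ahead on the ring
            gap = (pos[lead] - pos[i]) % length - car_len
            acc.append(idm_accel(vel[i], vel[lead], gap))
        # Speeds never go negative: vehicles do not reverse.
        vel = [max(0.0, v + dv * dt) for v, dv in zip(vel, acc)]
        pos = [(x + v * dt) % length for x, v in zip(pos, vel)]
    return pos, vel


def average_velocity(speeds):
    """System-level average velocity in m/s."""
    return sum(speeds) / len(speeds)
```

Calling `simulate_ring(length=260.0)` changes the ring length, and therefore the vehicle density, in the same way the videos vary it. Replacing one vehicle's `idm_accel` call with a different control law would be the natural place to plug in a controller.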

Platooning

Flow can be used to create new vehicle tasks and improve network performance. Below, a sequence of adjacent autonomous vehicles learns to platoon together to improve the average velocity of the human-driven vehicles.

Figure 8

Autonomous vehicles can be trained to handle various networks as well. Below, the loop is augmented with an intersection, creating a "figure 8" network. In this network, autonomous vehicles learn to handle the added intersection by bunching all vehicles together and, when all vehicles are autonomous, even weaving through the intersection.