Emergent Behaviors in Mixed-Autonomy Traffic


Cathy Wu, Aboudy Kreidieh, Eugene Vinitsky, Alexandre M. Bayen

Traffic dynamics are often modeled by complex dynamical systems for which classical analysis tools struggle to yield tractable policies that transportation agencies and planners can use. With the introduction of automated vehicles into transportation systems, there is a new need to understand the impacts of automation on transportation networks. The present article formulates and approaches the mixed-autonomy traffic control problem (where both automated and human-driven vehicles are present) using the framework of deep reinforcement learning (RL). The resulting policies and emergent behaviors in mixed-autonomy traffic settings provide insight into the potential for automating traffic through mixed fleets of automated and human-driven vehicles. Model-free learning methods are shown to naturally select policies and behaviors previously designed by model-driven approaches, such as stabilization and platooning, which are known to improve ring road efficiency and can even exceed a theoretical velocity limit. Remarkably, in an intersection network, RL succeeds at maximizing velocity by leveraging the structure of human driving behavior to form an efficient vehicle spacing. We describe our results in the context of existing control-theoretic results for stability analysis and mixed-autonomy analysis. This article additionally introduces state equivalence classes to improve the sample complexity of the learning methods.
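One way to read the state equivalence classes mentioned above is that vehicles of the same type are interchangeable, so any relabeling of them should map to the same observation. The sketch below illustrates that idea only. The function name `canonical_observation`, the choice of features (position, velocity, type flag), and the sort order are assumptions made for illustration, not the article's implementation.

```python
import numpy as np

def canonical_observation(positions, velocities, is_av, ring_length):
    """Return one canonical observation per equivalence class of states.

    Hypothetical helper: vehicles are ordered by (type, position, velocity),
    so relabeling vehicles of the same type does not change the result.
    """
    positions = np.asarray(positions, dtype=float) % ring_length
    velocities = np.asarray(velocities, dtype=float)
    is_av = np.asarray(is_av, dtype=bool)

    # AVs first, then human-driven vehicles; within a type, sort by position
    # (velocity is only a tie-breaker for identical positions).
    order = sorted(
        range(len(positions)),
        key=lambda i: (not is_av[i], positions[i], velocities[i]),
    )
    return np.stack(
        [positions[order], velocities[order], is_av[order].astype(float)],
        axis=1,
    )

# Two labelings of the same physical state give the same observation.
obs_a = canonical_observation([10, 50, 120], [1.0, 2.0, 3.0], [False, True, False], 230.0)
obs_b = canonical_observation([120, 10, 50], [3.0, 1.0, 2.0], [False, False, True], 230.0)
print(np.array_equal(obs_a, obs_b))
```

Under this assumption, a learner sees a single representative of each class rather than every permutation of vehicle indices, which is one plausible route to the sample-complexity improvement described above.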

Emergent behaviors in single-lane ring roads

Without automation, the system decays into stop-and-go traffic, in which some vehicles come to a full stop. The AVs successfully stabilize the inherently unstable traffic system, and adding more AVs improves the system-level velocity.
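The human-only baseline can be explored with a small car-following simulation. The sketch below is a minimal illustration under stated assumptions: it uses the Intelligent Driver Model (IDM) for human drivers, and the ring length, vehicle count, and all model parameters are illustrative choices that are not given in the text above. Whether stop-and-go waves actually appear depends on those choices, so the spread of speeds is printed only as a rough indicator.

```python
import numpy as np

def idm_accel(v, gap, dv, v0=30.0, T=1.0, a=1.0, b=1.5, s0=2.0, delta=4):
    """IDM acceleration (assumed human-driver model, illustrative parameters).

    v: own speed, gap: bumper-to-bumper distance to the leader,
    dv: own speed minus leader speed.
    """
    s_star = s0 + np.maximum(0.0, v * T + v * dv / (2.0 * np.sqrt(a * b)))
    return a * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

def simulate_ring(n=22, length=230.0, car_len=5.0, steps=3000, dt=0.1, seed=0):
    """Simulate n human-driven vehicles on a single-lane ring road.

    Returns an array of shape (steps, n) with every vehicle's speed over time.
    """
    rng = np.random.default_rng(seed)
    # Evenly spaced start with a small random perturbation; positions are kept
    # unwrapped so that vehicle i always follows vehicle i + 1.
    pos = np.arange(n) * (length / n) + rng.uniform(-0.5, 0.5, n)
    vel = np.zeros(n)
    history = np.empty((steps, n))
    for t in range(steps):
        lead_pos = np.roll(pos, -1)
        lead_pos[-1] += length          # last vehicle follows the first, one lap ahead
        lead_vel = np.roll(vel, -1)
        gap = np.maximum(lead_pos - pos - car_len, 1e-3)
        acc = idm_accel(vel, gap, vel - lead_vel)
        vel = np.maximum(vel + acc * dt, 0.0)   # explicit Euler step, no reversing
        pos = pos + vel * dt
        history[t] = vel
    return history

speeds = simulate_ring()
late = speeds[-500:]
print("mean speed (m/s):", late.mean())
print("speed spread (m/s):", late.std())
```

A large spread in the late-time speeds, together with some speeds at or near zero, would correspond to the stop-and-go pattern described above, while a spread near zero would indicate uniform flow. Replacing the acceleration of one vehicle with a learned or hand-designed controller is the natural extension for studying the mixed-autonomy case.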

Emergent behaviors in multi-lane roads

Without automation, the system is unstable, as in the single-lane case. With a few AVs, the AVs distribute themselves across the lanes and stabilize both lanes.

Emergent behaviors at intersections

Without automation, a right-of-way model (like a stop sign) results in slow traffic. A single AV learns to exploit human driving behavior, and an efficient vehicle spacing emerges. With AVs, the system learns that weaving vehicles through the intersection is a more efficient policy.