Realistic simulation of traffic agents is a critical yet underexplored component of autonomous driving research. While existing simulators excel at rendering high-fidelity sensor data, the plausibility, diversity, and controllability of agent behaviors remain limited—especially in long-horizon, closed-loop multi-agent interactions.
We present HMSim, a hierarchical multi-agent behavioral simulator designed to address these challenges. HMSim explicitly separates behavior simulation into two complementary levels:
High-level intention modeling, which jointly predicts long-term goals for all agents in a scene, enabling diverse and globally consistent behaviors.
Low-level policy control, which uses reinforcement learning to execute these intentions in a reactive, closed-loop manner, ensuring safety, physical realism, and robustness to out-of-distribution interactions.
This hierarchical design allows HMSim to promote behavior diversity without sacrificing controllability.
By modeling high-level intentions probabilistically and delegating short-term decision making to learned RL policies, HMSim can generate a wide range of realistic behaviors—including rare and sub-optimal ones—while maintaining stable long-term rollouts.
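The two-level split described above can be illustrated with a toy sketch. This is not HMSim's implementation: the function names (`sample_joint_intentions`, `low_level_step`, `rollout`), the 2D point-agent state, and the list of joint goal modes are all hypothetical, and a hand-written reactive controller stands in for the learned RL policy. It only shows the control flow: one joint goal assignment is sampled for the whole scene, then each agent pursues its goal in closed loop while reacting to the others.

```python
import math
import random


def sample_joint_intentions(joint_modes, rng):
    """High level (hypothetical): sample ONE goal assignment for all agents.

    joint_modes is a list of (probability, {agent_id: (x, y) goal}) pairs,
    so the goals of different agents are drawn together, not independently.
    """
    probs = [p for p, _ in joint_modes]
    assignments = [goals for _, goals in joint_modes]
    return rng.choices(assignments, weights=probs, k=1)[0]


def low_level_step(pos, goal, others, max_speed=1.0, safe_dist=1.5):
    """Low level (stand-in for an RL policy): head to the goal, slow near others."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist < 1e-9:
        return pos  # goal reached, hold position
    speed = min(max_speed, dist)  # never overshoot the goal
    too_close = any(
        math.hypot(pos[0] - o[0], pos[1] - o[1]) < safe_dist for o in others
    )
    if too_close:
        speed *= 0.5  # reactive slowdown based on the current state of others
    return (pos[0] + speed * dx / dist, pos[1] + speed * dy / dist)


def rollout(starts, joint_modes, steps=50, seed=0):
    """Closed-loop rollout: intentions are fixed once, control reacts every step."""
    rng = random.Random(seed)
    goals = sample_joint_intentions(joint_modes, rng)
    state = dict(starts)
    for _ in range(steps):
        state = {
            agent: low_level_step(
                pos, goals[agent], [q for b, q in state.items() if b != agent]
            )
            for agent, pos in state.items()
        }
    return goals, state


starts = {"a": (0.0, 0.0), "b": (10.0, 0.0)}
joint_modes = [
    (0.7, {"a": (5.0, 5.0), "b": (5.0, -5.0)}),
    (0.3, {"a": (0.0, 8.0), "b": (10.0, 8.0)}),
]
goals, final_state = rollout(starts, joint_modes, steps=50, seed=0)
```

Changing the seed changes which joint mode is drawn, which is where behavioral diversity would come from in this sketch. The low-level loop is unchanged across modes, which is what keeps the rollout controllable.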
Paper link: ICRA'26 (coming soon)
Our experiments show superior performance in simulation accuracy, diversity and plausibility.
We visualize scenarios resimulated from the Waymo datasets.
We use our hierarchical policy to control agent behaviors and train an ego-vehicle policy in a Gymnasium environment. The scenario is generated entirely from scratch, without relying on any existing dataset.
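For readers unfamiliar with the Gymnasium interface, the sketch below shows the shape of such a training environment. It is an assumption-laden toy, not the HMSim environment: `ToyEgoEnv`, its 1D lane, and the single lead vehicle with a randomly drawn speed are hypothetical, and the class only mimics the Gymnasium convention (`reset` returns `(obs, info)`, `step` returns `(obs, reward, terminated, truncated, info)`) without importing the library. A real environment would subclass `gymnasium.Env` and declare `observation_space` and `action_space`.

```python
import random


class ToyEgoEnv:
    """Hypothetical 1D car-following task that follows the Gymnasium step/reset convention."""

    def __init__(self, max_steps=100, dt=0.1):
        self.max_steps = max_steps
        self.dt = dt
        self._rng = random.Random()

    def reset(self, seed=None):
        # The scenario is generated from scratch: no logged data is replayed.
        if seed is not None:
            self._rng = random.Random(seed)
        self.t = 0
        self.ego_pos, self.ego_speed = 0.0, 5.0
        self.lead_pos = self._rng.uniform(15.0, 30.0)   # initial gap in metres
        self.lead_speed = self._rng.uniform(3.0, 8.0)   # stand-in for a sampled intention
        return self._obs(), {}

    def _obs(self):
        return (self.lead_pos - self.ego_pos, self.ego_speed, self.lead_speed)

    def step(self, action):
        # The action is the ego acceleration, clipped to a plausible range.
        accel = max(-3.0, min(3.0, float(action)))
        self.ego_speed = max(0.0, self.ego_speed + accel * self.dt)
        self.ego_pos += self.ego_speed * self.dt
        self.lead_pos += self.lead_speed * self.dt
        self.t += 1

        terminated = (self.lead_pos - self.ego_pos) <= 0.0   # collision with the lead vehicle
        truncated = (not terminated) and self.t >= self.max_steps
        reward = -10.0 if terminated else self.ego_speed * self.dt  # progress, or crash penalty
        return self._obs(), reward, terminated, truncated, {}


env = ToyEgoEnv(max_steps=20)
obs, info = env.reset(seed=1)
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(-3.0)  # always brake
    done = terminated or truncated
```

In this toy the lead vehicle drives at a constant speed. In a setup like the one described, its motion would instead come from the hierarchical agent policy at every step.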