Imagined Potential Games: A Framework for Simulating, Learning, and Evaluating Interactive Behaviors
[Paper] [Appendix] [Code (coming soon)]
Paper Highlight
Hallway, 2 agents
U-Turn, 2 agents
T-Intersection, 2 agents
Intersection, 4 agents
Imagined Potential Game (IPG): a distributed interaction-generation framework that models collaborative interaction behaviors and serves as a planner for reactive agents in arbitrary environments.
These reactive agents can be used to
Simulate Interactive Behaviors in different scenarios
Collaboration-required scenarios such as hallways and intersections.
Randomly generated scenarios.
Serve as reactive agents in simulation, enabling an agent to Learn to interact with collaborative closed-loop reactive agents.
Gym Env with reactive agents (not rule-based, not replay)
How to learn? Safe RL? Hierarchical RL? Inverse RL?
Evaluate the effectiveness of planners in highly interactive scenarios.
Interaction with heterogeneous agents of different types.
Evaluation metrics for effective interactive navigation.
[Details on Interaction Generation] [Details on RL training]
Generate diverse, realistic interactions in highly interactive scenarios in a distributed way.
Examples of interactions in different scenarios are shown below. For more demos and details on resolving deadlocks (which cannot be fully avoided in a distributed setting), see Interaction Generation Details.
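Below is a minimal sketch of the distributed rollout loop, assuming each agent runs its own planner and imagines the other agents as participants in a joint game. The class name `IPGPlanner`, its parameters, and the placeholder `plan` body are illustrative assumptions, not the released implementation or the paper's potential-game MPC.

```python
import numpy as np

class IPGPlanner:
    """Hypothetical per-agent planner. In the actual IPG framework, each agent
    would estimate the others' goals and solve an imagined joint trajectory
    optimization; here `plan` is only a goal-attraction / repulsion placeholder."""

    def __init__(self, goal, safety_radius=0.5):
        self.goal = np.asarray(goal, dtype=float)
        self.safety_radius = safety_radius

    def plan(self, own_state, observed_states):
        # Move toward the goal ...
        direction = self.goal - own_state
        step = 0.1 * direction / (np.linalg.norm(direction) + 1e-6)
        # ... and push away from any agent inside twice the safety radius.
        for other in observed_states:
            offset = own_state - other
            dist = np.linalg.norm(offset)
            if dist < 2 * self.safety_radius:
                step += 0.1 * offset / (dist + 1e-6)
        return step

# Distributed rollout: every agent replans at each step from its own view only.
states = [np.array([0.0, 0.0]), np.array([5.0, 0.0])]
planners = [IPGPlanner(goal=[5.0, 0.0]), IPGPlanner(goal=[0.0, 0.0])]
for _ in range(100):
    actions = [p.plan(s, [o for j, o in enumerate(states) if j != i])
               for i, (p, s) in enumerate(zip(planners, states))]
    states = [s + a for s, a in zip(states, actions)]
```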
Hallway, 2 agents
Hallway, 3 agents
Hallway, 3 agents
Hallway, 4 agents
T-Intersection, 2 agents
T-Intersection, 3 agents
T-Intersection, 3 agents
T-Intersection, 4 agents
U-Turn, 2 agents
U-Turn, 2 agents
U-Turn, 2 agents
U-Turn, 2 agents
Intersection, 2 agents
Intersection, 3 agents
Intersection, 3 agents
Intersection, 4 agents
Random Obstacles, 2 agents
Random Obstacles, 2 agents
Random Obstacles, 3 agents
Random Obstacles, 3 agents
How should an agent interact with collaborative agents that react to its behavior? It must learn when to be conservative and when to be aggressive.
Interactive RL Environment
Given different scenarios where IPG agents interact effectively, we replace one IPG agent with a user-controlled (RL) agent; a minimal environment sketch follows the list below.
An RL agent that succeeds in this environment needs to:
Navigate safely and efficiently.
Interact successfully with IPG agents configured with different interaction parameters, representing agents with different personalities.
Interact with agents effectively in different scenarios (both interactive and non-interactive).
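As a concrete (hypothetical) illustration of this setup, here is a minimal Gymnasium-style environment sketch in which agent 0 is RL-controlled and the remaining agents would be driven by closed-loop IPG planners. The class name `IPGGymEnv`, the `interaction_params` argument, and the placeholder dynamics and reward are assumptions, not the released interface.

```python
import numpy as np
import gymnasium as gym

class IPGGymEnv(gym.Env):
    """Hypothetical interactive RL environment: agent 0 is RL-controlled,
    agents 1..n would be replanned by IPG at every step (closed-loop)."""

    def __init__(self, scenario="hallway", num_ipg_agents=1, interaction_params=None):
        self.scenario = scenario
        self.num_ipg_agents = num_ipg_agents
        # "Personality" knobs for the IPG agents (names are illustrative).
        self.interaction_params = interaction_params or {"safety_radius": 0.5}
        obs_dim = 4 * (1 + num_ipg_agents)  # (x, y, vx, vy) per agent
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, (obs_dim,), dtype=np.float64)
        self.action_space = gym.spaces.Box(-1.0, 1.0, (2,), dtype=np.float64)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._states = np.zeros((1 + self.num_ipg_agents, 4))
        return self._states.flatten(), {}

    def step(self, action):
        # RL agent applies `action` as a velocity command; a real implementation
        # would also replan every IPG agent here based on the new joint state.
        self._states[0, 2:] = np.clip(action, -1.0, 1.0)
        self._states[:, :2] += 0.1 * self._states[:, 2:]
        reward = -0.01 * float(np.linalg.norm(action))  # placeholder cost term
        terminated, truncated = False, False
        return self._states.flatten(), reward, terminated, truncated, {}

# Usage: different interaction parameters give the IPG agents different "personalities".
env = IPGGymEnv(scenario="hallway", interaction_params={"safety_radius": 0.8})
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```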
Hallway: IPG agent will choose to yield.
Hallway: RL agent learns to go straight since the IPG agent will yield.
IPG agent stops and interacts after observing the two agents.
T-Intersection: RL agent behaves similarly to the IPG agent.
IPG agents interact in the intersection scenario.
RL agent fails to learn an effective way of interacting with multiple agents.
How to evaluate interactive planner performance?
The problems with trajectory prediction and existing sim-agent metrics: (1) they are data-driven and require real data; (2) they do not cover highly interactive edge cases.
With IPG, we can test:
We show that IPG agents can interact with these two types of agents.
The IPG agent is able to resolve deadlocks by increasing its own safety radius (sketched below).
The IPG agent avoids collisions via MPC and continuously updates its estimates of other agents.
The robot can always choose the most conservative strategy and yield to other agents, but this is not efficient.
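A hedged sketch of the deadlock-resolution heuristic mentioned above: when an agent has made little progress over a recent window (a typical symptom of a distributed deadlock), it inflates its own safety radius so that its planner backs off and yields. The window, thresholds, growth factor, and the assumption that the planner exposes a `safety_radius` attribute are illustrative, not values from the paper.

```python
import numpy as np

def maybe_resolve_deadlock(planner, position_history,
                           window=20, progress_eps=0.05,
                           growth=1.5, max_radius=2.0):
    """Inflate `planner.safety_radius` when recent progress is negligible,
    nudging the agent to yield and break a symmetric standoff."""
    if len(position_history) < window:
        return
    recent = np.asarray(position_history[-window:])
    progress = np.linalg.norm(recent[-1] - recent[0])
    if progress < progress_eps and planner.safety_radius < max_radius:
        planner.safety_radius *= growth
```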
We propose using the completion time under a centralized planner (where there is no true "interaction", since all agents are jointly planned) as the baseline for measuring the extra time cost of interaction.
Completion Time: 6.6 sec (baseline)
Extra Time: +2.0 sec
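The metric reduces to a simple difference; the sketch below only illustrates the arithmetic with the numbers above (a 6.6 sec centralized baseline and an 8.6 sec distributed completion time imply 2.0 sec of extra interaction time; the function name is illustrative).

```python
def extra_interaction_time(distributed_completion_time: float,
                           centralized_completion_time: float) -> float:
    """Extra time spent on interaction, relative to the centralized baseline."""
    return distributed_completion_time - centralized_completion_time

print(round(extra_interaction_time(8.6, 6.6), 2))  # -> 2.0 seconds of extra time
```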