9:00 - 9:20
Konstantin Yakovlev
In this talk, we will elaborate on different variants of the multi-agent path finding (MAPF) problem formulation, its intrinsic and extrinsic assumptions, the complexity of (certain variants of) the problem, and, if time permits, briefly overview the families of methods tailored to solving (different variants of) MAPF.
9:20 - 9:55
Sven Koenig
In this talk, we will describe the advantages and disadvantages of using techniques from heuristic search for solving MAPF problems, and then provide a high-level overview of search-based MAPF methods and the current state of the art in the field.
9:55 - 10:30
Guillaume Sartoretti
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. However, as the number of agents (robots) in the system grows, so does the combinatorial complexity of coordinating them, drastically affecting the scalability of coupled planners and the solution quality of fully decoupled methods. As a new direction for research into scalable, high-quality MAPF, my work has embraced advances in distributed reinforcement learning (RL) to let multiple agents learn decentralized, collaborative policies in a time-efficient manner. In this talk, I will first introduce our recent AI-based frameworks for MAPF (PRIMAL) and lifelong MAPF (PRIMAL2), which combine deep RL with imitation learning from a complete, optimal planner. As a result, the policies learned by PRIMAL/PRIMAL2 naturally scale to near-arbitrary team sizes while exhibiting the type of coordinated behaviors usually only associated with coupled planners. The reactive nature of these policies is also particularly well suited for robotic deployments, especially in dynamic environments, as exhibited by our numerical and experimental results. I will then discuss our recent works, which focus on explicitly combining learning-based MAPF with conventional, complete planners towards highly scalable, yet provably complete MAPF. Finally, I will touch on some of our most recent advances in learning-based MAPF, which rely on: 1) communication learning, where agents are tasked with learning a movement policy as well as a communication protocol, allowing them to identify, encode, and share relevant information about their environment/intentions towards true team-level cooperation in MAPF; 2) context learning, which aims at mitigating the effect of partial observability in decentralized, AI-based MAPF; and 3) social optimization, where agents learn to optimize their behaviors towards the common good, sometimes acting selfishly to break ties in symmetric situations, and sometimes selflessly to simplify each other's tasks.
10:30 - 11:00
Break
11:00 - 11:40
POGEMA: framework and testbed for learning-based MAPF solvers
Alex Panov, Alex Skrynnik
In this talk, we will introduce POGEMA, a comprehensive software toolkit for developing learnable MAPF methods. It includes a fast environment for training, a problem-instance generator, a collection of predefined instances, a visualization toolkit, and a benchmarking tool for automated evaluation. We will also discuss selected MAPF and lifelong MAPF algorithms that are supported by POGEMA.
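As a quick taste of the API, the following minimal sketch (in the spirit of the POGEMA README; all parameter values are illustrative) creates a random grid instance and steps it with a random policy:

```python
from pogema import pogema_v0, GridConfig

# Illustrative configuration: a 16x16 grid with 30% obstacle density
# and 8 agents, each observing a local window around itself.
grid_config = GridConfig(
    num_agents=8,
    size=16,
    density=0.3,
    seed=42,             # fixes the generated instance
    max_episode_steps=128,
    obs_radius=5,
)

env = pogema_v0(grid_config=grid_config)
obs, info = env.reset()

terminated = truncated = [False]
while not (all(terminated) or all(truncated)):
    # A learned policy would replace the random actions here.
    obs, reward, terminated, truncated, info = env.step(env.sample_actions())
```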
11:40 - 13:00
Alex Skrynnik, Anton Andreychuk
In this section, we will demonstrate how to use the POGEMA environment and run algorithms within it. We will cover how to create visualizations, including animated SVGs and vectorized illustrations for papers. We will also show how to conduct evaluations, save raw results, and use them to create high-quality plots.
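For example, recording an episode as an SVG can be done by wrapping the environment in POGEMA's animation monitor; the sketch below assumes the pogema.animation wrapper as shipped with the library (file names and the static-figure option are illustrative):

```python
from pogema import pogema_v0, GridConfig
from pogema.animation import AnimationConfig, AnimationMonitor

env = pogema_v0(grid_config=GridConfig(num_agents=8, size=16, seed=7))
env = AnimationMonitor(env)  # records the episode while it is played

obs, info = env.reset()
terminated = truncated = [False]
while not (all(terminated) or all(truncated)):
    obs, reward, terminated, truncated, info = env.step(env.sample_actions())

# Animated SVG of the full episode.
env.save_animation("episode.svg")
# Static variant, e.g. as a vectorized illustration for a paper.
env.save_animation("episode_static.svg", AnimationConfig(static=True))
```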
Additionally, we will explain how to specify an evaluation protocol and compute results for a range of domain-specific metrics, enabling fair multi-dimensional comparisons. We will also discuss how to scale evaluations by parallelizing them across servers or multiple nodes using Dask. Finally, we will demonstrate how to adapt existing approaches to work within the POGEMA environment.
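To give a flavor of the scaling step, here is a hypothetical sketch of fanning independent evaluation runs out over a Dask cluster; evaluate_instance is a stand-in for whatever per-instance evaluation function the protocol defines, not part of POGEMA's API:

```python
from dask.distributed import Client


def evaluate_instance(seed: int) -> dict:
    """Hypothetical helper: evaluate one policy on one instance, return raw metrics."""
    from pogema import pogema_v0, GridConfig  # imported on the worker

    env = pogema_v0(grid_config=GridConfig(num_agents=8, size=16, seed=seed))
    obs, info = env.reset()
    steps, terminated, truncated = 0, [False], [False]
    while not (all(terminated) or all(truncated)):
        obs, reward, terminated, truncated, info = env.step(env.sample_actions())
        steps += 1
    return {"seed": seed, "steps": steps, "solved": all(terminated)}


if __name__ == "__main__":
    # With no arguments Client() starts a local cluster; pass a scheduler
    # address ("tcp://...") to spread the work across multiple nodes.
    client = Client()
    futures = client.map(evaluate_instance, range(64))  # one run per seed
    results = client.gather(futures)                    # raw results to save/plot
```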
The End