Anonymous Authors
Submitted to the IEEE Conference on Games 2025 (under review)
Abstract
Autonomous agents operating in domains such as robotics or video-game simulations must adapt to changing tasks without forgetting previous ones. This process, known as Continual Reinforcement Learning (CRL), poses non-trivial difficulties, from preventing catastrophic forgetting to ensuring that the approaches considered scale. Building on recent advances, we introduce a benchmark providing a suite of video-game navigation scenarios, filling a gap in the literature and capturing key challenges: catastrophic forgetting, task adaptation, and memory efficiency. We define a set of varied tasks and datasets, evaluation protocols, and metrics to assess algorithm performance, including state-of-the-art baselines. Our benchmark is designed not only to foster reproducible research and accelerate progress in continual reinforcement learning for gaming, but also to provide a reproducible framework for navigation methods, helping practitioners identify and apply effective approaches.
Visualization of Human-Generated Data
Human Player Navigating a Large Maze — In this video, a human player controls a character moving through one of the large maps featured in Continual NavBench, demonstrating basic navigation behavior during gameplay.
Why Are Benchmarks Essential for Game Development?
Benchmarks provide common ground for objectively evaluating and comparing different approaches. In video-game production, benchmarks such as the Atari 2600 suite and the Procgen Benchmark have driven advances in game AI by offering reproducible, controlled environments that mimic key aspects of gameplay. They enable developers to quantitatively assess how well a navigation method performs, how robust it is to variations, and how efficiently it uses computational resources. In production pipelines, this objective evaluation is critical: it helps teams select and refine strategies that deliver consistent performance at predictable, controlled cost, ensuring smooth integration into game engines. Such rigorous testing is essential not only for improving player experience but also for reducing development overhead and ensuring scalability in game projects.
How Does Continual Learning Benefit Production Pipelines?
Continual Reinforcement Learning (CRL) enables learning agents to adapt incrementally to new tasks, such as modified maps, updated physics, or new abilities, without requiring complete retraining. This is particularly important in production pipelines, notably for video games, which often undergo frequent updates involving topological changes (altering map layouts) or kinematic modifications (updating movement dynamics). Offline Reinforcement Learning (Offline RL) further supports this process by leveraging large pre-collected datasets to update models without the cost and risk of live data collection. By integrating Offline CRL, development teams can efficiently refine and update bots, ensuring that they remain effective over time while reducing both development overhead and downtime during updates.
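To picture how an Offline CRL update might work, the sketch below trains a policy on a pre-collected dataset while rehearsing transitions replayed from earlier tasks, a simple defence against catastrophic forgetting. Everything here is an illustrative assumption, not the benchmark's method: the linear behaviour-cloning policy, the function name `offline_continual_update`, and the synthetic "expert" data are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def offline_continual_update(weights, new_data, replay_buffer, lr=0.1, mix=0.5):
    """One offline pass mixing new-task transitions with replayed
    old-task data (rehearsal-style mitigation of forgetting)."""
    n_replay = min(len(replay_buffer), int(mix * len(new_data)))
    idx = rng.choice(len(replay_buffer), size=n_replay, replace=False) if n_replay else []
    batch = list(new_data) + [replay_buffer[i] for i in idx]
    for state, action in batch:
        pred = weights @ state
        # SGD step on 0.5 * ||w @ s - a||^2 (linear behaviour cloning)
        weights = weights - lr * np.outer(pred - action, state)
    return weights

# Synthetic "expert" for one task: actions are a fixed linear map of states
A = np.array([[1.0, -0.5], [0.3, 0.8]])
dataset = [(s, A @ s) for s in rng.normal(size=(64, 2))]

w = np.zeros((2, 2))
for _ in range(50):  # repeated offline passes over the pre-collected data
    w = offline_continual_update(w, dataset, replay_buffer=[])
# w now approximates the expert map A, without any live data collection
```

When a game update introduces a new task, the same call would be made with the new dataset as `new_data` and the old datasets in `replay_buffer`, so the policy keeps fitting past tasks while adapting to the new one.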
Human-Generated Trajectories on the Small Maps.
What Makes Navigation Tasks in Virtual Worlds Unique?
Traditional approaches to navigation in video games rely on pre-computed navigation meshes (NavMeshes) to define walkable areas and generate paths. While NavMeshes are effective in static or semi-static settings, they often require extensive manual design and frequent updates when game environments change. In contrast, AI agents trained with Reinforcement Learning can learn diverse navigation strategies and, with Continual Reinforcement Learning, adjust to evolving layouts and obstacles. This flexibility is especially valuable in modern games with procedurally generated levels or frequently updated maps, where static solutions may fall short. Our benchmark evaluates these learned navigation methods, helping developers build more adaptive bots, as AI-driven approaches may offer better performance and efficiency than traditional NavMesh-based techniques.
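For contrast, the traditional pipeline reduces path generation to a shortest-path query over the pre-baked NavMesh graph, whose nodes are walkable polygons. The sketch below is a minimal illustration with invented data: the toy graph `mesh`, its traversal costs, and the function name are hypothetical and not taken from the benchmark.

```python
from heapq import heappush, heappop

def navmesh_path(adjacency, start, goal):
    """Dijkstra over a pre-baked NavMesh graph: nodes are walkable
    polygons, weighted edges are traversable borders between them."""
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, step_cost in adjacency.get(node, []):
            if neighbour not in visited:
                heappush(frontier, (cost + step_cost, neighbour, path + [neighbour]))
    return float("inf"), []  # goal unreachable from start

# Toy mesh: three polygons A, B, C with directed traversal costs
mesh = {"A": [("B", 1.0), ("C", 4.0)], "B": [("C", 1.0)], "C": []}
cost, path = navmesh_path(mesh, "A", "C")  # cost 2.0 via ['A', 'B', 'C']
```

The weakness motivating learned alternatives is visible here: any change to the map invalidates `mesh`, and the graph must be re-baked before queries are correct again, whereas a trained policy can be updated from data.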