25 February 2025 | Sala Stringa | 11:30 | Sergio Mover (Computer Science Department of École Polytechnique)
Abstract
Hierarchical Reinforcement Learning (HRL) decomposes the decision-making process into a hierarchy of agents, where higher-level agents solve more abstract tasks (e.g., navigating a robot through a maze) and lower-level agents solve more control-related tasks (e.g., moving the robot in a specific direction). While HRL has been applied to complex tasks in continuous environments (e.g., in robotics applications), an open challenge is that the state space in which the high-level agent operates (also called the goal representation) is not known a priori.
We tackle the problem of learning an effective goal representation for HRL. The main challenges we solve are learning a goal representation that captures the underlying, unknown environment dynamics; learning such a representation together with the reinforcement learning policies (i.e., "online"); and scaling the HRL algorithm to high-dimensional continuous environments.
In our solution, the HRL algorithm incrementally learns a goal representation as an abstraction partitioning the continuous state space. We define an abstraction with respect to the reachability relation of the agent and show that, under some assumptions, this choice guarantees the existence of a bound on the number of refinement steps of the abstraction and enables learning a policy with bounded performance guarantees.
On the practical side, we propose an HRL algorithm that simultaneously learns the hierarchical policies and the abstraction (the goal representation). Technically, the algorithm approximates the agent's reachability relation with a neural network and uses set-based reachability analysis to compute a refinement. We further show that the approach scales to complex robotics environments from the MuJoCo simulator.
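As a rough illustration of what "an abstraction partitioning the continuous state space" and "a refinement" can look like, the minimal Python sketch below represents the abstraction as a list of axis-aligned boxes and refines it by splitting one box in two. Everything in it is an assumption made for illustration: the names `Box`, `region_of`, and `refine` are hypothetical, and the split point is supplied by hand, whereas in the work presented it would be derived from reachability analysis on the learned neural-network model, which this sketch does not implement.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # half-open interval [lo, hi)


@dataclass(frozen=True)
class Box:
    """Hypothetical abstract goal: an axis-aligned region of the state space."""
    bounds: Tuple[Interval, ...]

    def contains(self, state: Sequence[float]) -> bool:
        # A state belongs to the box if every coordinate lies in its interval.
        return all(lo <= x < hi for x, (lo, hi) in zip(state, self.bounds))

    def split(self, dim: int, at: float) -> Tuple["Box", "Box"]:
        # Cut the box in two along dimension `dim` at coordinate `at`.
        lo, hi = self.bounds[dim]
        if not lo < at < hi:
            raise ValueError("split point must lie strictly inside the box")
        lower = self.bounds[:dim] + ((lo, at),) + self.bounds[dim + 1:]
        upper = self.bounds[:dim] + ((at, hi),) + self.bounds[dim + 1:]
        return Box(lower), Box(upper)


def region_of(partition: List[Box], state: Sequence[float]) -> int:
    """Map a concrete state to the index of the region that contains it."""
    for index, box in enumerate(partition):
        if box.contains(state):
            return index
    raise ValueError("state lies outside the partitioned space")


def refine(partition: List[Box], index: int, dim: int, at: float) -> List[Box]:
    """Return a new partition in which region `index` is replaced by two halves."""
    lower, upper = partition[index].split(dim, at)
    return partition[:index] + [lower, upper] + partition[index + 1:]


# Start with a single coarse region covering a 10 x 10 state space,
# then refine it once along the first dimension (split point chosen by hand).
coarse = [Box(((0.0, 10.0), (0.0, 10.0)))]
refined = refine(coarse, index=0, dim=0, at=4.0)
```

After the refinement, the high-level agent in this sketch would distinguish two goals instead of one, so states on either side of the cut map to different regions.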
The talk presents joint work with Mehdi Zadem (LIX) and Sao Mai Nguyen (ENSTA Paris) from ICDL 2023 and ICLR 2024.
Bio
Sergio Mover is a Professor in the Computer Science Department of École Polytechnique and a member of the Cosynus team at LIX (Computer Science Laboratory of École Polytechnique). Previously, he was a postdoctoral researcher at the University of Colorado Boulder; he obtained his Ph.D. in Computer Science from the University of Trento in 2014. His research focuses on formal methods, in particular model checking for hybrid and cyber-physical systems, and program analysis.