Active Hierarchical Exploration
with Stable Subgoal Representation Learning
1. Subgoal Representation Learning
The following videos show the subgoal representation learning process in the Ant Maze (Images) task. Each frame contains the 2D state embeddings of 5 trajectories in a U-shaped maze (red marks the start of each trajectory, blue the end). Notably, the representations are learned from low-resolution images.
Ant Maze task
(a) HESS
(b) w/o stability regularization
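The contrast above can be sketched in code. The following is a minimal, illustrative objective for learning such a 2D embedding: a contrastive term pulls temporally adjacent states together and pushes distant states apart, and a stability term (the regularization ablated in video (b)) penalizes drift from a frozen copy of the previous encoder. The encoder form, margin, and coefficient are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def representation_loss(phi, phi_old, s_t, s_tk, s_neg, margin=1.0, beta=0.1):
    """Illustrative subgoal-representation objective (not the paper's exact loss).

    phi, phi_old: callables mapping a batch of states to 2D embeddings,
                  where phi_old is a frozen copy of the previous encoder.
    s_t, s_tk:    batches of temporally adjacent states (pulled together).
    s_neg:        batch of temporally distant states (pushed apart).
    margin, beta: assumed hyperparameters for the contrastive margin and
                  the weight of the stability regularization.
    """
    z_t, z_tk, z_neg = phi(s_t), phi(s_tk), phi(s_neg)
    pos = np.sum((z_t - z_tk) ** 2, axis=1)                             # attract adjacent states
    neg = np.maximum(0.0, margin - np.sum((z_t - z_neg) ** 2, axis=1))  # repel distant states
    stability = np.sum((z_t - phi_old(s_t)) ** 2, axis=1)               # penalize embedding drift
    return float(np.mean(pos + neg + beta * stability))

# Usage with a toy linear encoder (hypothetical):
rng = np.random.default_rng(0)
W_new, W_old = rng.normal(size=(8, 2)), rng.normal(size=(8, 2))
phi = lambda s: s @ W_new
phi_old = lambda s: s @ W_old
s = rng.normal(size=(32, 8))
loss = representation_loss(phi, phi_old, s, s + 0.1, rng.normal(size=(32, 8)))
```

Without the stability term (beta = 0), nothing ties the new embedding to the old one, so the representation can change arbitrarily between updates, which is what the ablation video visualizes.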
2. Hierarchical Policy Learning
Alongside the stable subgoal representation learning, we develop an active hierarchical exploration strategy that seeks out novel and promising latent states. To the best of our knowledge, our method is the first to support the concurrent learning of hierarchical policies and the subgoal representation in long-horizon continuous-control tasks with sparse rewards. In the videos below, the small red squares denote subgoals in the representation space.
Ant Maze
Ant Maze (Images)
Ant Push
Ant Push (Images)
Ant FourRooms
Point Maze
Cheetah Hurdle
Cheetah Ascending
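One simple way to sketch the "seek out novel latent states" idea is count-based novelty over a discretized latent space: visited cells accumulate counts, rarely visited cells score higher, and the explorer proposes the highest-scoring candidate as the next subgoal. This is a hypothetical stand-in for the paper's actual exploration measure, which also weighs the potential of a latent state; the cell size and scoring function here are assumptions.

```python
import numpy as np

def discretize(z, cell_size=0.5):
    """Map a latent state to a grid-cell index used for visit counting."""
    return tuple(np.floor(np.asarray(z) / cell_size).astype(int))

class CountNovelty:
    """Count-based novelty over a discretized latent space (illustrative)."""
    def __init__(self, cell_size=0.5):
        self.cell_size = cell_size
        self.counts = {}

    def update(self, z):
        # Record a visit to the cell containing latent state z.
        c = discretize(z, self.cell_size)
        self.counts[c] = self.counts.get(c, 0) + 1

    def novelty(self, z):
        # Rarely visited cells get scores close to 1, frequent cells near 0.
        c = discretize(z, self.cell_size)
        return 1.0 / np.sqrt(self.counts.get(c, 0) + 1)

def select_subgoal(candidates, novelty_fn):
    """Propose the candidate latent state with the highest novelty score."""
    scores = [novelty_fn(z) for z in candidates]
    return candidates[int(np.argmax(scores))]

# Usage: after many visits near the origin, an unvisited cell wins.
nov = CountNovelty()
for _ in range(10):
    nov.update([0.1, 0.1])
goal = select_subgoal([[0.1, 0.1], [3.0, 3.0]], nov.novelty)  # -> [3.0, 3.0]
```

Because the representation is kept stable, counts accumulated in earlier episodes remain meaningful for later subgoal selection; under a drifting representation the same cell could correspond to different states over time.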