Ce Hao, Anxing Xiao, Zhiwei Xue, Harold Soh
National University of Singapore
Abstract
Diffusion-based planners have shown strong performance in short-horizon tasks but often fail in complex, long-horizon settings. We trace the failure to loose coupling between high-level (HL) sub-goal selection and low-level (LL) trajectory generation, which leads to incoherent plans and degraded performance. We propose Coupled Hierarchical Diffusion (CHD), a framework that models HL sub-goals and LL trajectories jointly within a unified diffusion process. A shared classifier passes LL feedback upstream so that sub-goals self-correct while sampling proceeds. This tight HL–LL coupling improves trajectory coherence and enables scalable long-horizon diffusion planning. Experiments across maze navigation, tabletop manipulation, and household environments show that CHD consistently outperforms both flat and hierarchical diffusion baselines.
Method
Long-horizon task planning in robotics is challenging for diffusion models due to weak coordination between high-level subgoal planning and low-level trajectory generation.
We introduce a coupled hierarchical diffusion (CHD) framework that jointly generates and iteratively refines subgoals and trajectories within a unified diffusion process.
The joint diffusion process tightly couples high-level subgoal selection and low-level trajectory generation, allowing for mutual feedback and online subgoal correction during planning.
A hierarchical classifier-guided feedback mechanism and an asynchronous parallel generation strategy enable efficient and scalable planning across long-horizon tasks.
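As a rough illustration of the coupling idea above, the following Python sketch refines subgoals and a low-level trajectory together under one shared cost. It is not the CHD implementation: the learned denoiser and classifier are replaced here by an analytic quadratic cost, and every name (`subgoal_indices`, `coupled_cost_grads`, `sample_plan`), the even spacing of subgoals along the trajectory, and all hyperparameters are assumptions made for the example. It only aims to show that a gradient computed on the low-level trajectory also moves the subgoals, so both levels settle on a consistent plan.

```python
import numpy as np

def subgoal_indices(T, K):
    """LL waypoint index tied to each of K subgoals (even spacing; an assumption)."""
    return np.linspace(0, T - 1, K + 2).round().astype(int)[1:-1]

def coupled_cost_grads(subgoals, traj, start, goal):
    """Gradients of a toy shared cost (a stand-in for a learned classifier).

    cost = sum of squared LL step lengths (smoothness, endpoints pinned)
         + sum of squared distances between each subgoal and its LL waypoint
    """
    idx = subgoal_indices(len(traj), len(subgoals))
    path = np.vstack([start, traj, goal])
    # Smoothness term: gradient with respect to each interior waypoint.
    grad_traj = 2.0 * (2.0 * path[1:-1] - path[:-2] - path[2:])
    # Coupling term: pulls tied waypoints and subgoals toward each other.
    diff = traj[idx] - subgoals
    np.add.at(grad_traj, idx, 2.0 * diff)
    grad_sub = -2.0 * diff  # LL feedback reaching the HL subgoals
    return grad_sub, grad_traj

def sample_plan(start, goal, K=3, T=16, steps=2000,
                step_size=0.1, noise0=0.02, seed=0):
    """Jointly refine subgoals and trajectory from pure noise (hypothetical loop)."""
    rng = np.random.default_rng(seed)
    subgoals = rng.normal(size=(K, 2))
    traj = rng.normal(size=(T, 2))
    for n in range(steps):
        noise = noise0 * (1.0 - (n + 1) / steps)  # annealed to zero at the last step
        g_sub, g_traj = coupled_cost_grads(subgoals, traj, start, goal)
        # Both levels are updated in the same step, using the same shared cost.
        subgoals = subgoals - step_size * g_sub + noise * rng.normal(size=subgoals.shape)
        traj = traj - step_size * g_traj + noise * rng.normal(size=traj.shape)
    return subgoals, traj

start, goal = np.array([0.0, 0.0]), np.array([1.0, 1.0])
subgoals, traj = sample_plan(start, goal)
```

In this toy setting the subgoals start at random positions and end up on the trajectory because the coupling gradient flows in both directions. A hierarchical planner that fixed the subgoals first would have no such correction path.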
Experiment
Stars (☆) are intermediate subgoals.
Long-horizon trajectory planning in maze.
Given start and goal positions, diffusion planners generate the intermediate high-level (HL) subgoals and low-level (LL) trajectories.
Baseline hierarchical diffusion (BHD) plans suboptimal paths, while coupled hierarchical diffusion (CHD) aligns both levels to achieve better performance.
Real-robot demonstration
Tasks: 1) Drop bag into trash can; 2) Serve hamburger on coffee table; 3) Store sponge into cabinet; 4) Put milk box into fridge.
Offline task planning: coupled hierarchical diffusion planner
Downstream policies: grasp pose diffusion, MoveIt, ACT