Sec1: Gradient Field Visualisations (State-based)
Sec2: Task Demonstrations
Navigation (State)
Tracking (State)
Navigation (Image)
Tracking (Image)
Circling
Clustering
Circling + Clustering
Room Rearrangement
Sec3: Failure Cases: Local Minimum Issue
Sec4: Tackling The Local Minimum Issue: DualGF (PRM)
Inference pipeline of DualGF (PRM). We draw points from the free space by rejection sampling, rejecting any point whose support energy is below a threshold. We then select the sampled point with the largest target energy as the goal. Given the roadmap and the selected goal, we plan the shortest path from the initial point to the goal using Dijkstra's algorithm.
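The pipeline above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `support_energy` and `target_energy` are hypothetical callables standing in for the learned energies, the workspace is assumed to be the 2D unit square, and the connection radius and the interpolated edge check are choices made here for illustration only.

```python
import heapq
import math
import random


def sample_free_points(support_energy, threshold, n, rng, max_tries=100000):
    """Rejection sampling: keep points whose support energy is >= threshold."""
    points = []
    for _ in range(max_tries):
        if len(points) == n:
            break
        p = (rng.random(), rng.random())  # assumed workspace: unit square
        if support_energy(p) >= threshold:
            points.append(p)
    return points


def edge_is_free(p, q, support_energy, threshold, steps=10):
    """Assumed edge test: interpolated points on segment p-q must pass the threshold."""
    for i in range(1, steps):
        t = i / steps
        x = (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        if support_energy(x) < threshold:
            return False
    return True


def build_roadmap(points, radius, support_energy, threshold):
    """Connect node pairs closer than `radius` whose segment stays in free space."""
    adj = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= radius and edge_is_free(points[i], points[j], support_energy, threshold):
                adj[i].append((j, d))
                adj[j].append((i, d))
    return adj


def dijkstra(adj, src, dst):
    """Return the shortest path as a list of node indices, or None if unreachable."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def plan(start, support_energy, target_energy, threshold, n=200, radius=0.2, seed=0):
    """Sample nodes, pick the goal by target energy, and plan from node 0 (the start)."""
    rng = random.Random(seed)
    points = [start] + sample_free_points(support_energy, threshold, n, rng)
    goal = max(range(1, len(points)), key=lambda i: target_energy(points[i]))
    adj = build_roadmap(points, radius, support_energy, threshold)
    return points, goal, dijkstra(adj, 0, goal)
```

Calling `plan(start, support_energy, target_energy, threshold)` returns the sampled nodes, the index of the selected goal, and the node-index path (or `None` if the roadmap does not connect the start to the goal). Edge weights are Euclidean distances, so the result is the shortest path over the roadmap rather than over the continuous free space.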
Episode 1
Episode 1
Episode 2
Episode 2
Episode 3
Episode 3
Sec5: Mitigating the Instability of Lagrangian Update
(a) lambda_t increases too fast
(b) lambda_t drops too fast
(c) lambda_t is properly updated
Sec6: OOD Test on Navigation
Table 2: Quantitative results. The performance of DualGF on the key metric (TRF) does not drop significantly as the number of blocks increases.
Support Gradient Field
5 blocks
An example episode
5 blocks
Support Gradient Field
6 blocks
An example episode
6 blocks
Sec7: RCE(State) works better than RCE(Image)
RCE, Tracking (State)
Chasing the goal.
High collision.
RCE, Tracking (State)
Chasing the goal.
High collision.
RCE, Tracking (Image)
Collapsed policy.
Low collision.
RCE, Tracking (Image)
Collapsed policy.
Low collision.
Sec8: Comparison between DualGF and TarGF
Figure 1: Performance of DualGF, TarGF(SAC), and TarGF(ORCA) on Ball and Room Rearrangement only. DualGF significantly outperforms TarGF(SAC) in PL and achieves results comparable to the upper bound, TarGF(ORCA).