Goal-Aware Prediction (GAP)
GAP is a model-based RL algorithm for solving visuomotor tasks without rewards.
Motivating Problem: Learned dynamics models have to capture everything about a scene, even components irrelevant to the task.
Our Solution: Learn a dynamics model conditioned on goals that captures only the goal-relevant components of the scene.
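The idea can be sketched in a few lines. This is a minimal illustration rather than the paper's architecture: the random linear maps below stand in for learned encoder, dynamics, and decoder networks, and all names (encode, step_latent, predict_residual, rollout_goal_error) and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learned networks; in practice these would be
# convolutional encoders/decoders trained on images.
D_OBS, D_LATENT, D_ACT = 8, 4, 2
W_enc = rng.normal(size=(D_LATENT, 2 * D_OBS)) * 0.1        # encodes (s_t, s_g) jointly
W_dyn = rng.normal(size=(D_LATENT, D_LATENT + D_ACT)) * 0.1  # latent dynamics
W_dec = rng.normal(size=(D_OBS, D_LATENT)) * 0.1             # decodes the goal residual

def encode(s_t, s_g):
    """Encode the current state together with the goal, so the latent
    can discard goal-irrelevant parts of the scene."""
    return W_enc @ np.concatenate([s_t, s_g])

def step_latent(z, a):
    """One step of latent dynamics under action a."""
    return np.tanh(W_dyn @ np.concatenate([z, a]))

def predict_residual(z):
    """Decode a goal residual (distance-to-goal in observation space)
    rather than the full next frame, concentrating error on
    goal-relevant components."""
    return W_dec @ z

def rollout_goal_error(s_0, s_g, actions):
    """Predicted distance to the goal after executing an action sequence."""
    z = encode(s_0, s_g)
    for a in actions:
        z = step_latent(z, a)
    return float(np.linalg.norm(predict_residual(z)))
```

A planner can then score candidate action sequences with `rollout_goal_error` and pick the one with the lowest predicted residual.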
We test GAP on a set of simulated manipulation tasks.
GAP redistributes model errors so that predictions are more accurate on goal-relevant states.
Doing so improves task performance on visuomotor control tasks.
GAP can be combined with video prediction models to scale to complex visual scenes.
Our analysis suggests that model accuracy on the best trajectories matters most for planning performance. Does GAP improve accuracy on the best trajectories?
Yes it does!
While GAP performs similarly to standard dynamics models when averaged over all trajectories, its prediction error becomes significantly lower as we restrict attention to better and better trajectories.
We observe this effect qualitatively as well: GAP predicts task-relevant objects more accurately while not modeling irrelevant ones.
By distributing model error to be accurate on the best states, does GAP improve task performance when used for planning?
Yes it does!
GAP outperforms standard latent dynamics models, model-free self-supervised approaches (RIG), and a latent dynamics model learned via an inverse model loss.
GAP also outperforms its ablations on most tasks, suggesting that both goal conditioning and residual prediction are important for task performance.
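To make the planning setup concrete, here is a hedged sketch of a cross-entropy-method (CEM) planner of the kind commonly paired with learned visual dynamics models. The cost function below is a toy quadratic standing in for the model's predicted goal residual, and all names and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

H, D_ACT = 5, 2  # planning horizon and action dimension

def cost(actions):
    # Toy stand-in for the learned model's predicted goal residual;
    # any cost with signature (H x D_ACT array) -> float works here.
    return float(np.sum((actions - 0.3) ** 2))

def cem_plan(cost_fn, n_iters=10, n_samples=64, n_elite=8):
    """Cross-entropy method: sample action sequences, keep the lowest-cost
    elites, refit a Gaussian to them, and repeat. The learned model enters
    only through cost_fn, which scores each candidate sequence."""
    mean = np.zeros((H, D_ACT))
    std = np.ones((H, D_ACT))
    for _ in range(n_iters):
        samples = mean + std * rng.normal(size=(n_samples, H, D_ACT))
        scores = np.array([cost_fn(s) for s in samples])
        elites = samples[np.argsort(scores)[:n_elite]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean

plan = cem_plan(cost)  # converges toward the quadratic's minimizer (0.3)
```

In this toy setting the returned plan approaches 0.3 in every dimension; with a learned model, the same loop would instead minimize the predicted distance to the goal image.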
Can GAP be combined with video prediction models to scale to cluttered visual scenes?
Yes it can!
When combined with SVG, GAP achieves lower test error on goal-reaching trajectories and visibly captures goal-relevant objects more effectively.