Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain

Stefan Wapnick, Travis Manderson, David Meger, Gregory Dudek

McGill University Mobile Robotics Lab, Montreal, Quebec, Canada

Published in IROS 2021

Paper: https://arxiv.org/abs/2112.04684

Abstract

We present a reward-predictive, model-based learning method featuring trajectory-constrained visual attention for use in mapless, local visual navigation tasks. Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions, in order to enhance predictive accuracy during planning. Our attention model is jointly optimized by the task-specific loss and an additional trajectory-constraint loss, allowing adaptability yet encouraging a regularized structure for improved generalization and reliability. Importantly, visual attention is applied in latent feature map space instead of raw image space to promote efficient planning. We validated our model on two visual navigation tasks: planning low-turbulence, collision-free trajectories in off-road settings, and hill climbing with locking differentials in the presence of slippery terrain. Experiments involved randomized, procedurally generated simulation environments and real-world environments. We found our method improved generalization and learning efficiency when compared to no-attention and self-attention alternatives.
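The joint objective described in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the Gaussian rasterization of the trajectory onto the latent grid, the cross-entropy form of the trajectory-constraint term, the squared-error task loss, and the weight `lam` are all hypothetical stand-ins for whatever the authors actually use.

```python
import numpy as np

def rasterize_trajectory(points_xy, grid_hw, sigma=1.0):
    """Hypothetical helper: turn trajectory points (latent-grid coordinates)
    into a soft target mask that sums to 1."""
    h, w = grid_hw
    ys, xs = np.mgrid[0:h, 0:w]
    mask = np.zeros((h, w))
    for x, y in points_xy:
        bump = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        mask = np.maximum(mask, bump)
    return mask / mask.sum()

def softmax2d(logits):
    """Normalize attention logits over the whole latent grid."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def attend(features, attn):
    """Attention-weighted pooling of a (C, H, W) latent feature map."""
    return (features * attn[None, :, :]).sum(axis=(1, 2))

def joint_loss(pred_reward, true_reward, attn_logits, traj_mask, lam=0.5):
    """Task loss plus an assumed trajectory-constraint term.

    The constraint is a cross-entropy between the trajectory mask and the
    predicted attention; it is smallest when attention matches the mask.
    """
    attn = softmax2d(attn_logits)
    task = (pred_reward - true_reward) ** 2
    constraint = -np.sum(traj_mask * np.log(attn + 1e-12))
    return task + lam * constraint, attn

# Toy example on an 8x8 latent grid with a 4-channel feature map.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 8))
mask = rasterize_trajectory([(1, 6), (2, 4), (3, 2)], (8, 8))

aligned_loss, aligned_attn = joint_loss(0.3, 0.5, np.log(mask + 1e-12), mask)
uniform_loss, uniform_attn = joint_loss(0.3, 0.5, np.zeros((8, 8)), mask)
pooled = attend(features, aligned_attn)
print(aligned_loss < uniform_loss, pooled.shape)
```

In this toy setup, attention that lies along the trajectory incurs a lower total loss than uniform attention for the same reward prediction error, which is the regularizing effect the abstract attributes to the trajectory constraint. Because the task loss is still part of the objective, attention remains free to deviate from the trajectory when that improves reward prediction.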

Fig 1. Our method learns visual attention locations in latent space (seen projected back into image space here) which align with trajectories implied by vehicle control actions and then uses this learned attention model to better predict candidate trajectory rewards during planning.

Fig 2. Ground-truth trajectories (white) and predicted attention masks (superimposed over all timesteps) for various experiments comparing self-attention (top) and trajectory-constrained attention (bottom). The self-attention variant generally follows the trajectory but sometimes focuses on background features leading to diminished generalization performance.

Fig 3. Sample high-scoring paths output by the trained local planner for experiments involving following low-turbulence, collision-free terrain (first two images) and driving up a slope in the presence of slippery terrain with selectable locking differentials.

Citation

@INPROCEEDINGS{wapnick2021iros,
  author={Wapnick, Stefan and Manderson, Travis and Meger, David and Dudek, Gregory},
  booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  title={Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain},
  year={2021},
  volume={},
  number={},
  pages={460-467},
  doi={10.1109/IROS51168.2021.9636422}
}