Differentiable SLAM-net

Peter Karkus, Shaojun Cai, David Hsu

National University of Singapore

Simultaneous localization and mapping (SLAM) remains challenging for a number of downstream applications, such as visual robot navigation, because of rapid turns, featureless walls, and poor video quality. We introduce the Differentiable SLAM Network (SLAM-net) along with a navigation architecture to enable planar robot navigation in previously unseen indoor environments. SLAM-net encodes a particle-filter-based SLAM algorithm in a differentiable computation graph, and learns task-oriented neural network components by backpropagating through the particle filter algorithm. Because it can optimize all model components jointly for the end objective, SLAM-net learns to be robust in challenging conditions. We run experiments in the Habitat platform with different real-world RGB and RGB-D datasets. SLAM-net significantly outperforms the widely adopted ORB-SLAM in noisy conditions. Our navigation architecture with SLAM-net improves the state of the art for the Habitat Challenge 2020 PointNav task by a large margin (37% to 64% success).


Conference paper:
P. Karkus, S. Cai, D. Hsu. Differentiable SLAM-net: Learning Particle SLAM for Visual Navigation, CVPR, 2021


Workshop paper:

P. Karkus, S. Cai, D. Hsu. Differentiable SLAM-nets: Learning Task-Oriented SLAM for Visual Navigation, 3rd Robot Learning Workshop, NeurIPS, 2020


Code: TBA

Spotlight talk

SLAM is difficult for downstream robot navigation

SLAM can be challenging when connected to a downstream robot navigation task, for example when:

  • the robot faces featureless walls

  • the robot rotates rapidly

  • the camera is noisy

  • the frame rate is low

Images taken by our navigation agent in Habitat.

Differentiable SLAM-net

Differentiable SLAM-net is a novel differentiable SLAM architecture. It encodes a particle-filter-based SLAM algorithm and its associated models in a differentiable computation graph. The models are neural network components, and they are trained jointly end-to-end by backpropagating gradients through the SLAM pipeline. Our differentiable implementation builds on the Particle Filter Network and the Differentiable Mapping Network.
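To illustrate the kind of particle filter step that SLAM-net differentiates through, here is a minimal NumPy sketch. The function name, noise scale, and interfaces are illustrative assumptions, not the paper's actual code; in SLAM-net the transition and observation models are learned neural networks, and "soft" resampling (mixing the particle distribution with a uniform, then correcting the weights) keeps the new weights a differentiable function of the old ones.

```python
import numpy as np

def particle_update(poses, log_weights, odom, obs_log_lik, alpha=0.5):
    """One particle-filter step of the kind SLAM-net backpropagates through.

    poses:       (K, 3) particle poses (x, y, heading)
    log_weights: (K,) log particle weights
    odom:        (3,) relative motion estimate (a learned model in SLAM-net)
    obs_log_lik: function: (K, 3) poses -> (K,) log observation likelihoods
                 (a learned neural network in SLAM-net)
    alpha:       soft-resampling mixture coefficient in (0, 1]
    """
    K = len(poses)
    # 1. Transition: move each particle by the odometry estimate plus noise.
    poses = poses + odom + np.random.normal(scale=0.01, size=poses.shape)
    # 2. Measurement: reweight particles by the observation likelihood.
    log_weights = log_weights + obs_log_lik(poses)
    log_weights -= np.log(np.sum(np.exp(log_weights)))  # normalize
    # 3. Soft resampling: sample from a mixture of the particle distribution
    #    and a uniform distribution, then apply an importance correction so
    #    the new weights remain a function of the old ones (gradients flow).
    probs = alpha * np.exp(log_weights) + (1.0 - alpha) / K
    probs /= probs.sum()
    idx = np.random.choice(K, size=K, p=probs)
    new_log_weights = log_weights[idx] - np.log(probs[idx])
    new_log_weights -= np.log(np.sum(np.exp(new_log_weights)))
    return poses[idx], new_log_weights
```

The pose estimate at each step would then be the weight-averaged particle pose. With `alpha=1` this reduces to standard (non-differentiable) resampling.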

Visual navigation architecture with SLAM-net

We propose a visual navigation architecture that connects SLAM-net to a weighted D* path planner and a local subgoal controller. SLAM-net estimates a 2D occupancy map and the planar robot pose on the map. The path planner plans a path from the current pose to the goal. The local controller tracks the path and outputs (discrete) robot actions.
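The architecture above can be sketched as a simple perceive-plan-act loop. All component interfaces below (`slam.update`, `planner.plan`, `controller.step`, the goal threshold) are hypothetical names for illustration, not the actual SLAM-net API:

```python
import numpy as np

def navigate(slam, planner, controller, env, goal, max_steps=500):
    """Control loop of the proposed architecture: perceive -> plan -> act.

    Component interfaces are illustrative assumptions:
      slam.update(obs)        -> (2D occupancy map, planar pose [x, y, heading])
      planner.plan(...)       -> path from the pose estimate to the goal
                                 (a weighted D* planner in the paper)
      controller.step(...)    -> a discrete robot action tracking the path
    """
    obs = env.reset()
    for _ in range(max_steps):
        # SLAM-net: estimate the 2D occupancy map and the robot pose on it.
        grid_map, pose = slam.update(obs)
        if np.linalg.norm(pose[:2] - goal) < 0.2:
            return True  # goal reached (threshold is an assumed value)
        # Plan a path from the current pose estimate to the goal.
        path = planner.plan(grid_map, start=pose, goal=goal)
        # Local controller tracks the path and outputs a discrete action.
        action = controller.step(pose, path)
        obs = env.step(action)
    return False
```

Note that planning happens on the estimated map and pose, so SLAM errors propagate directly to navigation performance, which is exactly why SLAM-net is trained for the downstream task.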

Experiments

We perform experiments in the Habitat simulator, where a robot navigates previously unseen indoor environments given noisy RGB or RGB-D camera input.

SLAM results. The table reports success rate (SR) and localization error (RMSE) for different trajectories and input modalities (RGB-D or RGB only). SLAM-net successfully learns to localize. It outperforms a learned visual odometry model; classic FastSLAM, i.e., the same algorithm as in SLAM-net but with handcrafted model components; and the classic ORB-SLAM algorithm, which fails entirely. We find that ORB-SLAM works well only under ideal conditions: no observation noise, no action noise, and a high frame rate. Removing just one of these difficulties is not sufficient for good performance.
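For concreteness, the RMSE localization metric can be computed as the root-mean-square Euclidean error between estimated and ground-truth 2D positions along a trajectory. This is a standard definition, sketched here as a minimal helper; the paper's exact evaluation protocol (e.g., trajectory alignment or the success threshold) may differ:

```python
import numpy as np

def traj_rmse(est, gt):
    """Root-mean-square error between estimated and ground-truth 2D positions.

    est, gt: (T, 2) arrays of positions along a trajectory of length T.
    """
    est, gt = np.asarray(est, dtype=float), np.asarray(gt, dtype=float)
    sq_err = np.sum((est - gt) ** 2, axis=-1)  # per-step squared distance
    return float(np.sqrt(np.mean(sq_err)))
```

A trajectory would then count as a localization success when this error falls below a chosen threshold.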

Transfer. A model trained on the Gibson dataset transfers directly to unseen environments from the Replica and Matterport datasets.

Navigation. Our navigation agent with SLAM-net improves the state of the art on the CVPR Habitat Challenge 2020 PointNav task by a large margin. The leaderboard was captured on Nov. 16, 2020.