Robot Localisation and Mapping

This paper proposes a novel approach for global localisation of mobile robots in large-scale environments. Our method combines learning-based and filtering-based localisation, localising the robot efficiently and precisely by seeding Monte Carlo Localisation (MCL) with a deep-learned distribution. In particular, a fast localisation system rapidly estimates the 6-DOF pose with a deep probabilistic model (Gaussian Process Regression with a deep kernel), then a precise recursive estimator refines the estimated robot pose according to the geometric alignment. More importantly, the Gaussian method (i.e. deep probabilistic localisation) and the non-Gaussian method (i.e. MCL) can be integrated naturally via importance sampling, so the two systems work together seamlessly and mutually benefit from each other. To verify the proposed framework, we provide a case study in large-scale localisation with a 3D lidar sensor. Our experiments on the Michigan NCLT long-term dataset show that the proposed method localises the robot in 1.94 s on average (median of 0.8 s) with a precision of 0.75 m in a large-scale environment of approximately 0.5 km².

Li Sun, Daniel Adolfsson, Martin Magnusson, Henrik Andreasson, Ingmar Posner, and Tom Duckett. "Localising Faster: Efficient and precise lidar-based robot localisation in large-scale environments", IEEE International Conference on Robotics and Automation (ICRA) 2020.
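As a rough sketch of how the two systems connect: the deep probabilistic localiser supplies a Gaussian over 6-DOF poses, and MCL is seeded by sampling particles from it before the usual importance-weighted measurement updates. The Python below is a minimal illustration under that assumption; names such as `seed_mcl_particles` and `obs_likelihood` are hypothetical, not from the paper.

```python
import numpy as np

def seed_mcl_particles(mu, Sigma, n_particles=1000, rng=None):
    """Draw an initial MCL particle set from a Gaussian pose prior.

    mu (6,) and Sigma (6, 6) stand in for the mean and covariance
    predicted by the deep probabilistic localiser (GP with deep kernel).
    """
    rng = rng or np.random.default_rng()
    particles = rng.multivariate_normal(mu, Sigma, size=n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)
    return particles, weights

def importance_update(particles, weights, obs_likelihood):
    """One MCL measurement update: reweight each particle by the
    likelihood of its geometric alignment, then renormalise."""
    weights = weights * np.array([obs_likelihood(p) for p in particles])
    return weights / weights.sum()
```

Because the learned prior is sampled rather than linearised, the non-Gaussian MCL posterior is preserved while still benefiting from the fast deep estimate.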

This research presents a novel semantic mapping approach, Recurrent-OctoMap, learned from long-term three-dimensional (3-D) lidar data. Most existing semantic mapping approaches focus on improving the semantic understanding of single frames, rather than on 3-D refinement of semantic maps (i.e. fusing semantic observations). The most widely used approach to 3-D semantic map refinement is the "Bayesian update", which fuses consecutive predictive probabilities following a Markov-chain model. Instead, we propose a learning approach that fuses the semantic features, rather than simply fusing predictions from a classifier. In our approach, we represent and maintain the 3-D map as an OctoMap, and model each cell with a recurrent neural network, obtaining a Recurrent-OctoMap. The semantic mapping process can then be formulated as a sequence-to-sequence encoding-decoding problem. Moreover, in order to extend the duration of observations in our Recurrent-OctoMap, we developed a robust 3-D localization and mapping system for successively mapping a dynamic environment over more than two weeks of data, and the system can be trained and deployed with arbitrary memory length. We validate our approach on the ETH long-term 3-D lidar dataset. The experimental results show that our proposed approach outperforms the conventional "Bayesian update" approach.

Li Sun, Zhi Yan, Anestis Zaganidis, Cheng Zhao, Tom Duckett. "Recurrent-OctoMap: Learning Feature Fusion for Long-term 3D-Lidar-based Semantic Mapping", IEEE Robotics and Automation Letters (RA-L).
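For intuition, here is a minimal PyTorch sketch contrasting per-cell recurrent feature fusion with the baseline Bayesian (log-odds) update. The feature dimension, hidden size, class count and module name are assumptions for illustration; the paper's actual architecture may differ.

```python
import torch
import torch.nn as nn

class RecurrentCellFusion(nn.Module):
    """Fuse a sequence of per-frame semantic features for one map cell
    with an LSTM, sequence-to-sequence style."""
    def __init__(self, feat_dim=64, hidden_dim=64, n_classes=9):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_classes)

    def forward(self, feats):          # feats: (n_cells, T, feat_dim)
        fused, _ = self.lstm(feats)    # learned temporal fusion
        return self.head(fused)        # per-timestep class logits

def bayesian_update(log_odds, prob, eps=1e-6):
    """Baseline Markov-chain fusion: accumulate per-frame class
    probabilities as log-odds, as in a standard OctoMap update."""
    return log_odds + torch.log(prob + eps) - torch.log1p(-prob + eps)
```

In the recurrent variant the fusion weights are learned from data, whereas the log-odds update treats every observation identically.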

This paper introduces a fully deep learning approach to monocular SLAM, which performs simultaneous localization, using a neural network for learning visual odometry (L-VO), and dense 3D mapping. Dense 2D flow and a depth image are generated from monocular images by sub-networks, which are then used by a 3D flow associated layer in the L-VO network to generate dense 3D flow. Given this 3D flow, the dual-stream L-VO network can predict the 6-DOF relative pose and, furthermore, reconstruct the vehicle trajectory. In order to learn the correlation between motion directions, bivariate Gaussian modelling is employed in the loss function. The L-VO network achieves an overall performance of 2.68% average translational error and 0.0143 deg/m average rotational error on the KITTI odometry benchmark. Moreover, the learned depth is fully leveraged to generate a dense 3D map. As a result, an entire visual SLAM system, that is, learning monocular odometry combined with dense 3D mapping, is achieved.

Cheng Zhao, Li Sun, Pulak Purkait, Tom Duckett, Rustam Stolkin. "Learning Monocular Visual Odometry with Dense 3D Mapping using an End-to-end Network", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2018.
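The bivariate Gaussian loss can be written as a negative log-likelihood over pairs of motion components, with the network predicting a correlation term alongside the means and scales. The parameterisation below (log standard deviations, tanh-squashed correlation) is a common convention assumed for illustration, not necessarily the paper's exact formulation.

```python
import torch

def bivariate_gaussian_nll(pred, target):
    """NLL of 2D motion targets under a predicted bivariate Gaussian.

    pred packs (mu_x, mu_y, log_sigma_x, log_sigma_y, rho_raw) along the
    last axis; this layout is an assumption for illustration.
    """
    mu_x, mu_y = pred[..., 0], pred[..., 1]
    sx, sy = pred[..., 2].exp(), pred[..., 3].exp()
    rho = torch.tanh(pred[..., 4])               # keep |rho| < 1
    dx = (target[..., 0] - mu_x) / sx
    dy = (target[..., 1] - mu_y) / sy
    one_m_rho2 = 1.0 - rho ** 2
    z = dx ** 2 - 2.0 * rho * dx * dy + dy ** 2
    # constant log(2*pi) dropped; it does not affect the gradient
    nll = z / (2.0 * one_m_rho2) + sx.log() + sy.log() \
          + 0.5 * one_m_rho2.log()
    return nll.mean()
```

Letting the model output rho couples the two motion components, which is how the correlation between motion directions enters the training objective.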

Point cloud registration is the task of aligning 3D scans of the same environment captured from different poses. When semantic information is available for the points, it can be used as a prior in the search for correspondences to improve registration. Semantic-assisted Normal Distributions Transform (SE-NDT) is a new registration algorithm that reduces the complexity of the problem by using the semantic information to partition the point cloud into a set of normal distributions, which are then registered separately. In this letter we extend the NDT registration pipeline by using PointNet, a deep neural network for segmentation and classification of point clouds, to learn and predict per-point semantic labels. We also present the Iterative Closest Point (ICP) equivalent of the algorithm, a special case of Multichannel Generalized ICP. We evaluate the performance of SE-NDT against the state of the art in point cloud registration on the publicly available classification dataset Semantic3d.net. We also test the trained classifier and the algorithms on dynamic scenes, using a sequence from the public dataset KITTI. The experiments demonstrate the improvement of the registration in terms of robustness, precision and speed, across a range of initial registration errors, thanks to the inclusion of semantic information.

Anestis Zaganidis, Li Sun, Tom Duckett, Grzegorz Cielniak. "Integrating semantic knowledge into 3D Normal Distributions Transform registration", IEEE Robotics and Automation Letters (RA-L).
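To make the partitioning idea concrete, the numpy sketch below fits one normal distribution per (semantic label, voxel) cell, so that points are only matched against same-class distributions during registration. The function name, cell size and support threshold are hypothetical, not taken from the paper.

```python
import numpy as np
from collections import defaultdict

def build_semantic_ndt(points, labels, cell_size=1.0, min_pts=5):
    """Fit one normal distribution per (semantic label, voxel) cell.

    points: (N, 3) array; labels: length-N per-point class labels,
    e.g. predicted by a PointNet-style segmentation network.
    """
    cells = defaultdict(list)
    for p, lab in zip(points, labels):
        key = (lab, *np.floor(p / cell_size).astype(int))
        cells[key].append(p)
    ndt = {}
    for key, pts in cells.items():
        pts = np.asarray(pts)
        if len(pts) >= min_pts:        # enough support for a covariance
            ndt[key] = (pts.mean(axis=0), np.cov(pts.T))
    return ndt
```

Restricting matches to distributions of the same class shrinks the correspondence search space and prunes ambiguous associations, which is consistent with the robustness and speed gains the abstract reports.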