DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation Using Monocular Camera and Sparse LiDAR


Rishav*(1,2), Ramy Battrawy*(1), René Schuster(1), Oliver Wasenmüller(1,4), Didier Stricker(1,3)



*Equal Contribution

(1) German Research Center for Artificial Intelligence - DFKI, Kaiserslautern, Germany

(2) Birla Institute of Technology and Science - BITS Pilani, Pilani, India

(3) University of Kaiserslautern - TUK, Kaiserslautern, Germany

(4) University of Applied Sciences Mannheim



Accepted at IROS 2020


Abstract

Scene flow is the dense 3D reconstruction of the motion and geometry of a scene. Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction. These methods depend heavily on the quality of the RGB images and perform poorly in regions with reflective objects, shadows, or ill-conditioned lighting. LiDAR measurements are much less sensitive to these conditions, but LiDAR features are in general unsuitable for matching tasks due to their sparsity. Hence, using both LiDAR and RGB can potentially overcome the individual disadvantages of each sensor by mutual improvement and yield robust features that improve the matching process. In this paper, we present DeepLiDARFlow, a novel deep learning architecture which fuses high-level RGB and LiDAR features at multiple scales in a monocular setup to predict dense scene flow. Its performance is much better in the critical regions where image-only and LiDAR-only methods are inaccurate. We verify DeepLiDARFlow on the established data sets KITTI and FlyingThings3D and show strong robustness compared to several state-of-the-art methods that use other input modalities. The code of our paper is available at https://github.com/dfki-av/DeepLiDARFlow.
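
To make the multi-scale fusion idea concrete, the following PyTorch-style sketch concatenates RGB and LiDAR feature maps at each pyramid level and learns a joint representation. All module names, channel counts, and the concatenation-based fusion are illustrative assumptions for this sketch, not the paper's actual architecture.

import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Fuses RGB and LiDAR feature maps at a single pyramid scale."""
    def __init__(self, rgb_ch, lidar_ch, out_ch):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(rgb_ch + lidar_ch, out_ch, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1),
        )

    def forward(self, rgb_feat, lidar_feat):
        # Concatenate along the channel dimension, then learn a joint feature.
        return self.fuse(torch.cat([rgb_feat, lidar_feat], dim=1))

class MultiScaleFusion(nn.Module):
    """Applies one fusion block per pyramid level."""
    def __init__(self, channels=(32, 64, 96)):
        super().__init__()
        self.blocks = nn.ModuleList(FusionBlock(c, c, c) for c in channels)

    def forward(self, rgb_pyramid, lidar_pyramid):
        # rgb_pyramid / lidar_pyramid: lists of feature maps, one per scale.
        return [blk(r, l) for blk, r, l in
                zip(self.blocks, rgb_pyramid, lidar_pyramid)]

The fused pyramid would then feed a matching and decoding stage that regresses scene flow; the key point is that fusion happens on learned high-level features at several resolutions rather than on the raw inputs.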



Figure: Conventional Scene Flow Approach (left) vs. Our DeepLiDARFlow (right)

We introduce DeepLiDARFlow, a novel deep learning architecture which fuses a monocular image with the corresponding sparse LiDAR measurements (shown as green spots on the input image) for dense scene flow estimation. For very sparse LiDAR (∼100 points), DeepLiDARFlow comfortably outperforms the conventional scene flow approach employing the same kind of fusion.
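
For context, a sparse depth input like the green spots in the figure can be obtained by projecting the LiDAR point cloud into the image plane with a standard pinhole camera model. Below is a minimal NumPy sketch under assumed inputs (points already transformed into the camera frame and a known 3x3 intrinsic matrix K); the function name and the overwrite-on-collision behavior are simplifications for illustration.

import numpy as np

def project_lidar_to_image(points_cam, K, height, width):
    """points_cam: (N, 3) points in the camera frame; K: 3x3 intrinsics.
    Returns a sparse depth map of shape (height, width)."""
    depth_map = np.zeros((height, width), dtype=np.float32)
    pts = points_cam[points_cam[:, 2] > 0]  # keep points in front of the camera
    # Pinhole projection: u = fx * x / z + cx, v = fy * y / z + cy
    u = (K[0, 0] * pts[:, 0] / pts[:, 2] + K[0, 2]).astype(int)
    v = (K[1, 1] * pts[:, 1] / pts[:, 2] + K[1, 2]).astype(int)
    valid = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    # With ~100 points, pixel collisions are rare; later points simply overwrite.
    depth_map[v[valid], u[valid]] = pts[valid, 2]
    return depth_map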

Citation

@inproceedings{DeepLiDARFlow2020,
  title={{DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation Using Monocular Camera and Sparse LiDAR}},
  author={Rishav and Battrawy, Ramy and Schuster, Ren{\'e} and Wasenm{\"u}ller, Oliver and Stricker, Didier},
  booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2020},
}