Varun Ravi Kumar*, Stefan Milz*, Christian Witt*, Martin Simon*, Karl Amende*, Johannes Petzold*, Senthil Yogamani' and Timo Pech''
*Valeo DAR Kronach, Germany, 'Valeo Vision Systems, Ireland and ''Technische Universität Chemnitz, Germany
Near-field depth estimation around a self-driving car is an important function that can be achieved by four wide-angle fisheye cameras, each with a field of view of over 180°. Depth estimation based on convolutional neural networks (CNNs) produces state-of-the-art results, but progress is hindered because depth annotation cannot be obtained manually. Synthetic datasets are commonly used, but they have limitations. For instance, they do not capture the extensive variability in the appearance of objects such as vehicles that is present in real datasets. There is also a domain shift when performing inference on natural images, as illustrated by the many attempts to handle domain adaptation explicitly. In this work, we explore an alternative approach: training with sparse LiDAR data as ground truth for depth estimation on fisheye cameras. We built our own dataset using our self-driving car setup, which has a 64-beam Velodyne LiDAR and four wide-angle fisheye cameras. To handle the difference in viewpoints between the LiDAR and the fisheye cameras, an occlusion resolution mechanism was implemented. We started with Eigen’s multiscale convolutional network architecture [1] and improved it by modifying the activation function and optimizer. We obtained promising results on our dataset, with RMSE values comparable to the state-of-the-art results obtained on KITTI.
Even though the camera/LiDAR setups are different, the results provide a reasonable comparison to KITTI on the performance of monocular depth regression using sparse LiDAR input. In future work, we aim to improve the results by using better CNN encoders and more consecutive frames, which can exploit motion parallax. We also plan to augment the supervised training with synthetic data and unsupervised training techniques.
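Because LiDAR ground truth is sparse, an RMSE of the kind reported here is normally computed only over pixels that actually received a LiDAR return. The following is a minimal sketch of such a masked RMSE, not the evaluation code used in this work. The function name `masked_rmse` and the convention that a depth of 0 marks a pixel without a LiDAR return are assumptions made for illustration.

```python
import numpy as np


def masked_rmse(pred_depth, lidar_depth):
    """RMSE between predicted and LiDAR depth, over valid LiDAR pixels only.

    Assumption (not from the paper): pixels without a LiDAR return
    are encoded as 0 in `lidar_depth`.
    """
    pred_depth = np.asarray(pred_depth, dtype=np.float64)
    lidar_depth = np.asarray(lidar_depth, dtype=np.float64)
    if pred_depth.shape != lidar_depth.shape:
        raise ValueError("prediction and ground truth must have the same shape")

    valid = lidar_depth > 0  # boolean mask of pixels with a LiDAR return
    if not valid.any():
        raise ValueError("ground truth contains no valid LiDAR pixels")

    diff = pred_depth[valid] - lidar_depth[valid]
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy 2x2 example: only two pixels carry LiDAR depth (in metres).
pred = np.array([[2.0, 5.0],
                 [9.0, 4.0]])
gt = np.array([[0.0, 4.0],
               [0.0, 7.0]])
rmse = masked_rmse(pred, gt)  # errors are 1 m and -3 m on the valid pixels
```

The same mask can be applied to a training loss so that pixels without ground truth contribute no gradient; whether the authors do so is not stated in this excerpt.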
@inproceedings{ravikumar2018monocular,
title = {{Monocular fisheye camera depth estimation using sparse lidar supervision}},
author = {Ravi Kumar, Varun and Milz, Stefan and Witt, Christian and Simon, Martin and Amende, Karl and Petzold, Johannes and Yogamani, Senthil and Pech, Timo},
year = 2018,
booktitle = {2018 21st International Conference on Intelligent Transportation Systems (ITSC)},
pages = {2853--2858},
organization = {IEEE}
}