Hybrid Stereo

Binocular Stereo from Dual-Modality Cameras

In this work, we propose an unsupervised cross-spectrum depth-estimation framework guided by visible-light (VIS) images, where cross-spectrum means thermal and visible-light (TIR-VIS for short). The input to the framework is a cross-spectrum stereo pair: one VIS image and one thermal image. First, we train a base depth-estimation network on VIS stereo pairs. To adapt the trained network to cross-spectrum images, we propose a multi-scale feature-transfer network that maps features from the TIR domain to the VIS domain. Furthermore, we introduce a cross-spectrum depth cycle-consistency mechanism to improve the depth estimates for dual-spectrum image pairs. We also release to the community a large dual-spectrum dataset of visible-light and thermal stereo images captured in different scenes. Experimental results show that our method achieves better depth-estimation results than the existing methods we compare against. Our dataset is available at https://github.com/whitecrow1027/TIR-VIS-Datasets.
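To give a rough intuition for what a consistency constraint between two stereo views can look like, here is a minimal NumPy sketch of a generic left-right disparity consistency check. It is not the paper's cross-spectrum depth cycle-consistency formulation, which is not specified in this summary. The function names `warp_horizontal` and `lr_consistency_loss` are hypothetical, and the sketch assumes rectified images in which a left-view pixel at column x matches the right-view pixel at column x - d.

```python
import numpy as np


def warp_horizontal(src, disp):
    """Sample `src` at column (x - disp) in each row, with linear interpolation.

    Assumption: rectified stereo, so correspondences lie on the same row.
    Sample positions outside the image are clamped to the border.
    """
    h, w = src.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disp
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * src[rows, x0] + frac * src[rows, x1]


def lr_consistency_loss(disp_left, disp_right):
    """Mean absolute difference between the left disparity map and the
    right disparity map warped into the left view (a generic stand-in,
    not the method of the referenced paper)."""
    warped = warp_horizontal(disp_right, disp_left)
    return float(np.mean(np.abs(disp_left - warped)))


if __name__ == "__main__":
    # Two hypothetical disparity maps that agree everywhere give zero loss.
    d_left = np.full((4, 8), 2.0)
    d_right = np.full((4, 8), 2.0)
    print(lr_consistency_loss(d_left, d_right))
```

In a training setup, a term of this kind would typically be added to the photometric loss so that the disparities predicted for the two views agree with each other; how the paper combines its cycle-consistency term with other losses is described in the reference below.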

Reference paper: Yubin Guo, Xinlei Qi, Jin Xie, Cheng-Zhong Xu, and Hui Kong, "Cross-Spectrum Unsupervised Depth Estimation by Visible-light and Thermal Cameras," IEEE Transactions on Intelligent Transportation Systems, 2023.