Cross-Modal Stereo by Using Kinect 

Wei-Chen Chiu, Ulf Blanke, and Mario Fritz

British Machine Vision Conference (BMVC) 2011
Workshop on Consumer Depth Cameras for Computer Vision (CDC4CV) 2011



Brief Introduction
Kinect's active sensing strategy is very well suited to producing robust, high-frame-rate depth maps for human pose estimation. But the shift to the robotics domain surfaced applications under a wider set of operating conditions than it was originally designed for. We see the sensor fail completely on transparent and specular surfaces, which are very common among everyday household objects.
    We complement the Kinect's depth estimate with a cross-modal stereo path obtained by disparity matching between the Kinect's built-in IR and RGB sensors. We investigate how the RGB channels can be combined optimally to mimic the image response of the IR sensor. Our combination method produces depth maps that include sufficient evidence for reflective and transparent objects while preserving textureless objects, such as tables or walls. [See our BMVC'11 paper for details.]
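As a rough illustration of the channel combination idea, the sketch below collapses an RGB image into a single channel by a weighted sum, so that it can be matched against the IR image. The function name `combine_rgb` and the weight values in the usage note are placeholders chosen for this example; they are not the weights or the optimization procedure from the paper.

```python
import numpy as np

def combine_rgb(rgb, weights):
    """Collapse an H x W x 3 RGB image to one channel as a weighted sum.

    `weights` holds one coefficient per channel (R, G, B). The values are
    an assumption of this sketch, not the optimized weights from the paper.
    """
    w = np.asarray(weights, dtype=np.float64)
    if w.shape != (3,):
        raise ValueError("weights must have exactly three entries (R, G, B)")
    total = w.sum()
    if total == 0:
        raise ValueError("weights must not sum to zero")
    w = w / total  # rescale so the weights sum to one
    # Matrix product over the last axis: (H, W, 3) @ (3,) -> (H, W)
    return np.asarray(rgb, dtype=np.float64) @ w
```

A call such as `combine_rgb(image, (0.6, 0.3, 0.1))` would emphasize the red channel; which weighting best mimics the IR response is exactly what has to be determined experimentally.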
    However, the method is troubled by interference from the IR projector that is required for the active depth sensing method. We investigate these issues and conduct a more detailed study of the physical characteristics of the sensors. Adapting the RGB image in the frequency domain to mimic an IR image did not yield improved performance. We further propose a more general method that learns optimal filters for cross-modal stereo under projected patterns. From the experimental results we conclude that our pre-filtered, cross-modal, SAD-based stereo vision algorithm profits most from combination in the spatial domain, rather than in the frequency domain.
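For readers unfamiliar with SAD-based matching, here is a minimal brute-force sketch of sum-of-absolute-differences block matching on a rectified image pair. It assumes both inputs are single-channel arrays of equal size and that a pixel at column x in the first image corresponds to column x - d in the second. The names `box_sum` and `sad_disparity`, the window radius, and the disparity range are choices made for this example, and the learned pre-filtering step is omitted, so this is not the pipeline from the paper.

```python
import numpy as np

def box_sum(img, radius):
    """Sum of img over a (2*radius+1)^2 window around each pixel (edge-padded)."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # Integral image with a leading row and column of zeros
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def sad_disparity(left, right, max_disp, radius=2):
    """Per-pixel integer disparity that minimizes the windowed SAD cost."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    if left.shape != right.shape or left.ndim != 2:
        raise ValueError("inputs must be 2-D arrays of identical shape")
    h, w = left.shape
    if not 0 <= max_disp < w:
        raise ValueError("max_disp must lie in [0, image width)")
    best_cost = np.full((h, w), np.inf)
    disparity = np.zeros((h, w), dtype=np.int64)
    for d in range(max_disp + 1):
        # Compare left column x with right column x - d (valid for x >= d only)
        diff = np.abs(left[:, d:] - right[:, : w - d])
        cost = np.full((h, w), np.inf)
        cost[:, d:] = box_sum(diff, radius)
        better = cost < best_cost
        disparity[better] = d
        best_cost[better] = cost[better]
    return disparity
```

In a cross-modal setting, `left` and `right` would be the IR image and the combined RGB image (or the reverse, depending on sensor layout) after rectification and any pre-filtering.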

Source Code, Dataset

References
Wei-Chen Chiu, Ulf Blanke, and Mario Fritz, "I spy with my little eye: Learning Optimal Filters for Cross-Modal Stereo under Projected Patterns," in 1st IEEE Workshop on Consumer Depth Cameras for Computer Vision (CDC4CV) in conjunction with Int’l Conf. on Computer Vision (ICCV), Barcelona, Spain, Nov 12, 2011. (Poster)

Wei-Chen Chiu, Ulf Blanke, and Mario Fritz, "Improving the Kinect by Cross-Modal Stereo," in 22nd British Machine Vision Conference (BMVC), Dundee, UK, Aug 29-Sept 02, 2011. (Poster)

Bibtex
CDC4CV:
@inproceedings{chiu2011spy,
  Author    = {Chiu, W.C. and Blanke, U. and Fritz, M.},
  Title     = {I spy with my little eye: 
               Learning optimal filters for cross-modal stereo under projected patterns},
  Booktitle = {{IEEE International Conference on Computer Vision (ICCV) Workshops}},
  Pages     = {1209--1214},
  Year      = {2011},
  Location  = {Barcelona, Spain}
}

BMVC:
@inproceedings{chiu2011improving,
  Author    = {Chiu, W.C. and Blanke, U. and Fritz, M.},
  Title     = {Improving the {K}inect by cross-modal stereo},
  Booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
  Year      = {2011},
  Location  = {Dundee, UK}
}