Description
Finding visual correspondence across images is a cornerstone of numerous multimedia applications. Correspondence maps serve as key building blocks for many high-level applications, including autonomous driving (using stereo matching and optical flow), computational photography, location-based services (using camera localization), and video surveillance (using scene recognition).
Visual correspondence methods aim to find a set of matched pixels between two or more images. In general, their performance relies primarily on two components: image descriptors and optimization algorithms. Traditional approaches for estimating depth or optical flow fields have advanced dramatically, as observed in several benchmarks. Recently, several efforts have also been made to establish correspondences across images that are captured of different 3D scenes yet have semantically similar appearances, and/or that are captured with large visual disparities, significant illumination changes, or different sensor sensitivities.
In this tutorial, we will cover various fundamental techniques that have been developed to design correspondence algorithms. We will first introduce state-of-the-art local image descriptors, which measure matching fidelity across multiple images. A more comprehensive overview will be given by categorizing these approaches according to correspondence density, invariance against photometric and geometric variations, and robustness against different sensor sensitivities.
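As a rough illustration of what invariance against photometric variations means for a local descriptor, the sketch below implements a census-style binary descriptor in the spirit of the non-parametric local transforms listed in the bibliography. It is not taken from the tutorial material: the function names `census_transform` and `hamming_distance`, the 3x3 default window, and the edge-replicating border handling are all assumptions made for this example.

```python
import numpy as np

def census_transform(image, radius=1):
    """Per-pixel binary descriptor: one bit per neighbor in a
    (2*radius+1) x (2*radius+1) window, set when the neighbor is darker
    than the center pixel. (Illustrative helper, not the tutorial's code.)"""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    # Replicate border pixels so every pixel has a full window (an assumption).
    padded = np.pad(img, radius, mode="edge")
    bits = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the center is not compared with itself
            neighbor = padded[radius + dy: radius + dy + h,
                              radius + dx: radius + dx + w]
            bits.append(neighbor < img)
    # Shape (h, w, (2*radius+1)**2 - 1), dtype bool.
    return np.stack(bits, axis=-1)

def hamming_distance(desc_a, desc_b):
    """Matching cost between two descriptor maps: number of differing bits."""
    return np.count_nonzero(desc_a != desc_b, axis=-1)

# Usage: a gain-and-offset brightness change leaves the descriptor unchanged,
# because only the ordering of intensities inside each window matters.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8))
brighter = 2 * image + 10
cost = hamming_distance(census_transform(image), census_transform(brighter))
```

Because the descriptor only encodes intensity ordering, it tolerates any strictly increasing intensity mapping, but it offers no invariance to geometric changes such as scale or rotation.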
We will then cover recent research on labeling optimization techniques. In particular, we focus on the computational cost and labeling accuracy of discrete labeling optimization techniques. Cost volume filtering-based approaches and global optimization approaches will be introduced, which effectively deal with the huge discrete label space and/or the high-order Markov Random Field (MRF) model by making use of efficient filtering algorithms and a randomized search strategy.
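The cost volume filtering idea can be sketched for the simplest case, stereo matching with a small set of integer disparities. The example below is a minimal sketch under stated assumptions, not the method of any paper listed here: it uses a plain box filter where the literature typically uses an edge-aware filter, an absolute-difference matching cost, and exhaustive evaluation of every label instead of a randomized search. The names `box_filter` and `stereo_by_cost_volume_filtering` are hypothetical.

```python
import numpy as np

def box_filter(img, radius):
    """Mean over a (2*radius+1)^2 window via a summed-area table.
    Stand-in for the edge-aware filters used in practice (an assumption)."""
    h, w = img.shape
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    # integral[i, j] holds the sum of padded[:i, :j].
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant")
    integral = integral.cumsum(axis=0).cumsum(axis=1)
    window_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                  - integral[k:k + h, :w] + integral[:h, :w])
    return window_sum / (k * k)

def stereo_by_cost_volume_filtering(left, right, max_disp, radius=2):
    """Build a cost volume over disparities 0..max_disp, smooth each
    slice, and pick the lowest-cost label per pixel (winner-take-all)."""
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    h, w = left.shape
    cost = np.empty((max_disp + 1, h, w))
    for d in range(max_disp + 1):
        # Left pixel (y, x) is compared with right pixel (y, x - d).
        shifted = np.empty_like(right)
        shifted[:, d:] = right[:, :w - d]
        if d > 0:
            shifted[:, :d] = right[:, :1]  # replicate where x - d < 0
        cost[d] = box_filter(np.abs(left - shifted), radius)
    return np.argmin(cost, axis=0)

# Usage on a synthetic pair whose true disparity is 3 everywhere.
rng = np.random.default_rng(0)
right_img = rng.integers(0, 256, size=(20, 40)).astype(np.float64)
left_img = np.roll(right_img, 3, axis=1)
disparity = stereo_by_cost_volume_filtering(left_img, right_img, max_disp=6)
```

The loop visits every label, so the run time grows linearly with the size of the label space. That growth is the computational challenge described above, and it becomes prohibitive for large label spaces such as 2D optical flow.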
Finally, we will introduce several applications that rely on visual correspondence algorithms, including scene understanding, robot navigation, computational photography, and 3D scene reconstruction. We will present the key ideas of these recent applications, while highlighting the essential roles that visual correspondences play.
Contact
Dr. Dongbo Min: dbmin (at) cnu (dot) ac (dot) kr, dbmin99 (at) gmail (dot) com
Dr. Jiangbo Lu: jiangbo (dot) lu (at) adsc (dot) com (dot) sg
Course Material
Part 1: Introduction and descriptor (ppt, pdf)
Part 2: Regularizing the estimates: labeling optimization (ppt, pdf)
Part 3: Applications (ppt, pdf)
Code
* Descriptor
DASC: Dense Adaptive Self-Correlation Descriptor for Multi-modal and Multi-spectral Correspondence (CVPR 2015): webpage
* Labeling optimization
DAISY Filter Flow (CVPR 2014): webpage
(More code will be added!)
Bibliography
- NEW: J. Lu, Y. Li, H. Yang, D. Min, W. Eng, and M. N. Do, “PatchMatch Filter: Edge-Aware Filtering Meets Randomized Search for Correspondence Field Estimation,” IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2016.
- NEW: W.-Y. Lin, S. Liu, N. Jiang, M. N. Do, P. Tan, and J. Lu, “RepMatch: Robust Feature Matching and Pose for Reconstructing Modern Cities,” European Conf. Computer Vision (ECCV), 2016.
- NEW: Y. Li, D. Min, M. N. Do, and J. Lu, “Fast Guided Global Interpolation for Depth and Motion,” European Conf. Computer Vision (ECCV), 2016. (spotlight)
- NEW: Y. Li, D. Min, M. S. Brown, M. N. Do, and J. Lu, “SPM-BP: Sped-up PatchMatch Belief Propagation for Continuous MRFs,” ICCV 2015. (oral)
- S. Kim, D. Min, B. Ham, S. Ryu, M. N. Do, and K. Sohn, “DASC: Dense Adaptive Self-Correlation Descriptor for Multi-modal and Multi-spectral Correspondence,” CVPR 2015.
- J. Lu, H. Yang, D. Min, and M. N. Do, “PatchMatch Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation,” CVPR 2013.
- H. Yang, W.-Y. Lin, and J. Lu, “DAISY Filter Flow: A Generalized Discrete Approach to Dense Correspondences,” CVPR 2014.
- D. Vu, B. Chidester, H. Yang, M. N. Do, and J. Lu, “Efficient Hybrid Tree-Based Stereo Matching with Applications to Post-Capture Image Refocusing,” IEEE Trans. on Image Processing, 2014.
- K. Zhang, Y. Fang, D. Min, L. Sun, S. Yang, S. Yan, and Q. Tian, “Cross-Scale Cost Aggregation for Stereo Matching,” CVPR 2014.
- W.-Y. Lin, M. Cheng, S. Zheng, J. Lu, and N. Crook, “Robust Non-parametric Data Fitting for Correspondence Modeling,” ICCV 2013.
- W.-Y. Lin, M. Cheng, J. Lu, H. Yang, M. N. Do, and P. H. S. Torr, “Bilateral Functions for Global Motion Modeling,” ECCV 2014.
- B. Ham, D. Min, C. Oh, M. N. Do, and K. Sohn, “Probability-Based Rendering for View Synthesis,” IEEE Trans. on Image Processing (TIP), 2014.
- W.-Y. Lin, L. Lin, Y. Matsushita, K.-L. Low, and S. Liu, “Aligning Images in the Wild,” CVPR 2012.
- S. Choi, D. Min, B. Ham, and K. Sohn, “Unsupervised Texture Flow Estimation Using Visual Correspondence,” IEEE Trans. on Image Processing (TIP), 2015.
- J. Lu, K. Shi, D. Min, L. Lin, and M. N. Do, “Cross-Based Local Multipoint Filtering,” CVPR 2012.
- D. Min, S. Choi, J. Lu, B. Ham, K. Sohn, and M. N. Do, “Fast Global Image Smoothing Based on Weighted Least Squares,” IEEE Trans. on Image Processing (TIP), vol. 23, no. 12, pp. 5638-5653, Dec. 2014.
- D. Min, J. Lu, and M. N. Do, “Joint Histogram Based Cost Aggregation for Stereo Matching,” IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), vol. 35, no. 10, pp. 2539-2545, Oct. 2013.
- S. Choi, D. Min, B. Ham, Y. Kim, C. Oh, and K. Sohn, “Depth Analogy: Data-driven Approach for Single Image Depth Estimation using Gradient Samples,” IEEE Trans. on Image Processing (TIP). (under review)
- D. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. Journal of Computer Vision, 2004.
- S. Leutenegger, et al., “BRISK: Binary robust invariant scalable keypoints,” ICCV 2011.
- M. Calonder, et al., “BRIEF: Computing a local binary descriptor very fast,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2012.
- J. M. Morel and G. Yu, “ASIFT: A new framework for fully affine invariant image comparison,” SIAM Journal on Imaging Sciences, 2009.
- E. Tola, V. Lepetit, and P. Fua, “DAISY: An efficient dense descriptor applied to wide-baseline stereo,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2010.
- P. Weinzaepfel, J. Revaud, Z. Harchaoui, and C. Schmid, “DeepFlow: Large displacement optical flow with deep matching,” ICCV 2013.
- E. Shechtman and M. Irani, “Matching local self-similarities across images and videos,” CVPR 2007.
- H. Takeda, S. Farsiu, and P. Milanfar, “Kernel regression for image processing and reconstruction,” IEEE Trans. on Image Processing, 2007.
- R. Zabih and J. Woodfill, “Non-parametric local transforms for computing visual correspondence,” ECCV 1994.
- H. Hirschmuller, “Stereo processing by semi-global matching and mutual information,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2008.
- Y. S. Heo, K. M. Lee, and S. U. Lee, “Robust stereo matching using adaptive normalized cross-correlation,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2011.
- X. Shen, L. Xu, Q. Zhang, and J. Jia, “Multi-modal and multi-spectral registration for natural images,” ECCV 2014.
- H. Hirschmuller and D. Scharstein, “Evaluation of stereo matching costs on images with radiometric differences,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2009.
- C. Vogel, S. Roth, and K. Schindler, “An Evaluation of Data Costs for Optical Flow,” GCPR 2013.
- K.-J. Yoon and I. S. Kweon, “Adaptive support-weight approach for correspondence search,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2006.
- C. Rhemann, A. Hosni, M. Bleyer, C. Rother, and M. Gelautz, “Fast cost-volume filtering for visual correspondence and beyond,” CVPR 2011.
- C. Barnes, E. Shechtman, A. Finkelstein, and D. Goldman, “PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing,” SIGGRAPH 2009.
- C. Barnes, E. Shechtman, D. Goldman, and A. Finkelstein, “The Generalized PatchMatch Correspondence Algorithm,” ECCV 2010.
- L. Bao, Q. Yang, and H. Jin, “Fast Edge-Preserving PatchMatch for Large Displacement Optical Flow,” CVPR 2014.
- Q. Yang, “A Non-Local Cost Aggregation Method for Stereo Matching,” CVPR 2012.
- P. Felzenszwalb and D. Huttenlocher, “Efficient belief propagation for early vision,” Int. Journal of Computer Vision, 2006.
- F. Besse, C. Rother, A. Fitzgibbon, and J. Kautz, “PMBP: Patchmatch belief propagation for correspondence field estimation,” Int. Journal of Computer Vision, 2014.
- C. Liu, J. Yuen, and A. Torralba, “SIFT flow: Dense correspondence across scenes and its applications,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2011.
- P. Krähenbühl and V. Koltun, “Efficient inference in fully connected CRFs with Gaussian edge potentials,” NIPS 2011.
- T. Brox and J. Malik, “Large displacement optical flow: descriptor matching in variational motion estimation,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2011.
- L. Xu, J. Jia, and Y. Matsushita, “Motion detail preserving optical flow estimation,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 2012.
- M. Werlberger, T. Pock, and H. Bischof, “Motion estimation with non-local total variation regularization,” CVPR 2010.
- D. Sun, S. Roth, and M. J. Black, “Secrets of optical flow estimation and their principles,” CVPR 2010.
- Y. HaCohen et al., “NRDC: Non-Rigid Dense Correspondence with Applications for Image Enhancement,” SIGGRAPH 2011.
- C. Liu, J. Yuen and A. Torralba, “Nonparametric Scene Parsing via Label Transfer,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2011.
- K. Karsch, C. Liu and S. B. Kang, “DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2014.
- K. Yamaguchi, D. McAllester and R. Urtasun, “Efficient Joint Segmentation, Occlusion Labeling, Stereo and Flow Estimation,” ECCV 2014.
- J. Zhang and S. Singh, “Visual-lidar Odometry and Mapping: Low-Drift, Robust, and Fast,” ICRA 2015.
- K. Yamaguchi, D. McAllester, and R. Urtasun, “Robust Monocular Epipolar Flow Estimation,” CVPR 2013.
- A. Geiger, P. Lenz and R. Urtasun, “Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite,” CVPR 2012.
- F. Banterle, A. Artusi, K. Debattista, and A. Chalmers, “Advanced High Dynamic Range Imaging: Theory and Practice,” AK Peters/CRC Press, 2011.
- N. K. Kalantari, E. Shechtman, C. Barnes, S. Darabi, D. Goldman, and P. Sen, “Patch-based High Dynamic Range Video,” SIGGRAPH Asia 2013.
- M. Brown and D.G. Lowe, “Automatic Panoramic Image Stitching using Invariant Features,” Int. Journal of Computer Vision, 2006.
- Y. Shih, S. Paris, F. Durand, and W. T. Freeman, “Data-driven Hallucination of Different Times of Day from a Single Outdoor Photo,” ACM Transactions on Graphics (SIGGRAPH Asia), 2013.
- J. J. Leonard and H. F. Durrant-Whyte, “Simultaneous map building and localization for an autonomous mobile robot,” IROS 1991.
- J. Engel, T. Schöps, and D. Cremers, “LSD-SLAM: Large-Scale Direct Monocular SLAM,” ECCV 2014.
- B. Triggs, P. McLauchlan, R. Hartley, and A. Fitzgibbon, “Bundle Adjustment: A Modern Synthesis,” Vision Algorithms: Theory and Practice, 2000.
- R. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon, “KinectFusion: Real-time dense surface mapping and tracking,” ISMAR 2011.
- M. Fischler and R. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Comm. of the ACM, 1981.