Bundled Depth-Map Merging for Multi-View Stereo

Jianguo Li, Eric Li, Yurong Chen, Lin Xu

Intel Labs China


Depth-map merging is a typical category of techniques for multi-view stereo (MVS) reconstruction. To guarantee accuracy, existing algorithms usually require either sub-pixel stereo matching precision or continuous depth-map estimation. Merging inaccurate depth-maps remains a challenging problem. This paper introduces a bundle optimization method for robust and accurate depth-map merging. In this method, depth-maps are generated using the DAISY feature, followed by two stages of bundle optimization. The first stage optimizes the tracks of connected stereo matches to generate initial 3D points. The second stage optimizes the positions and normals of the 3D points. The resulting high-quality point cloud is then meshed into geometric models.

The proposed method can be easily parallelized on multi-core processors. The Middlebury evaluation shows that it is one of the most efficient methods among non-GPU algorithms while still maintaining very high accuracy. We also demonstrate the effectiveness of the proposed algorithm on various real-world, high-resolution, self-calibrated data sets, including objects with complex details, objects with large highlight areas, and objects with non-Lambertian surfaces.


Jianguo Li, Eric Li, Yurong Chen, Lin Xu, "Bundled depth-map merging for multi-view stereo", in CVPR 2010, San Francisco. [pdf] [video-talk]

TechReport (new)

Jianguo Li, Eric Li, Yurong Chen, Lin Xu, "Visual 3D Modeling from Images and Videos", tech-report, Intel Labs China, 2010. [pdf] [Slides]

Note: This report and the accompanying slides describe all our work in image- and video-based 3D modeling.

Algorithm overview
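
As a rough illustration of how a track of connected stereo matches can yield an initial 3D point, the sketch below computes the point that is closest, in the least-squares sense, to a bundle of viewing rays. This is a standard ray-intersection triangulation and not the paper's bundle optimization; the function name `triangulate_track` and the ray representation (camera center plus viewing direction per match) are assumptions made for this example only.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _normalize(v: Sequence[float]) -> Vec3:
    """Return v scaled to unit length."""
    n = (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    if n == 0.0:
        raise ValueError("zero-length ray direction")
    return (v[0] / n, v[1] / n, v[2] / n)


def _det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def triangulate_track(centers: Sequence[Sequence[float]],
                      directions: Sequence[Sequence[float]]) -> Vec3:
    """Hypothetical helper: least-squares point closest to all rays.

    Each ray i is given by a camera center c_i and a viewing direction d_i.
    With P_i = I - d_i d_i^T (projection orthogonal to the ray), the point X
    minimizing sum_i ||P_i (X - c_i)||^2 solves (sum_i P_i) X = sum_i P_i c_i.
    """
    if len(centers) != len(directions) or len(centers) < 2:
        raise ValueError("need at least two rays, one direction per center")

    A = [[0.0, 0.0, 0.0] for _ in range(3)]
    b = [0.0, 0.0, 0.0]
    for c, d in zip(centers, directions):
        d = _normalize(d)
        for i in range(3):
            for j in range(3):
                p_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += p_ij
                b[i] += p_ij * c[j]

    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("rays are (nearly) parallel; point is not determined")

    # Solve the 3x3 system with Cramer's rule.
    x = []
    for k in range(3):
        A_k = [row[:] for row in A]
        for i in range(3):
            A_k[i][k] = b[i]
        x.append(_det3(A_k) / det)
    return (x[0], x[1], x[2])
```

For example, calling `triangulate_track` with three camera centers and the directions from each center toward a common point recovers that point up to floating-point error. A point obtained this way would only serve as a starting estimate; the refinement of positions and normals described in the abstract is a separate step that this sketch does not attempt.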


Supplementary video

  [demo videos] (18.6MB)   


  All photos were taken with a Canon IXUS 970 digital camera.

[Monster] (21MB): 34 photos, 1600×1200; complex surface; about 20cm high

[SculptFace] (31MB): 14 photos, 2816×2112; specular surface; about 1.8m high

[IronLion] (32MB): 17 photos, 2816×2112; 2.5m high

The reconstructed meshes for all three datasets: [mesh-file] (11MB)

The readme file describes the format for camera parameters.


We thank S. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski for setting up the Middlebury evaluation, and D. Scharstein for his great help with the evaluation of our results.

Thanks also go to Intel colleagues Jim Hurley and Horst Haussecker for their support of our project.

Note: Yimin Zhang made no contribution to any of my projects/publications.

Contact: Jianguo Li

Last updated on 07/04/2010