I am currently working at the Supercomputing Education and Research Center (SERC), Indian Institute of Science, with Prof. Vijay Natarajan on optimizing the Sparse Bundle Adjustment algorithm for NVIDIA's Tesla architecture.
The code written for this project is available here. If you need to understand it in detail, please refer to this document. All of the original code is the property of Prof. Manolis Lourakis. Please note that the original package has been changed to make specific sections of the code compile for the NVIDIA processor and to remove dependencies on external libraries; the Makefiles have been changed accordingly. The changes are reflected in the main Makefile and the subsequent Makefiles in each folder. You will need a Linux operating system (kernel 2.6.28 or greater). Only specific sections of the code have been rewritten to run on the NVIDIA processor (those involving the A, B, W, and S matrices). The Jacobian computation still depends on the LAPACK/BLAS libraries, so you will need to install them if your operating system does not already provide them.
Here is a brief description of my paper.
Sparse bundle adjustment is the final step in multi-view reconstruction. The initial estimate of the camera parameters can be obtained either from inertial sensors or from the camera's ego-motion. You can get more information on this step by consulting this book by Richard Hartley and Andrew Zisserman.
Once a rough estimate of the 3D coordinates is obtained, it remains to improve its accuracy. Several methods are available in the literature, most of which are based on minimizing the sum of squared errors between the actual and reprojected image points (reprojected points are obtained by projecting the estimated 3D coordinates back onto the image plane using the camera parameters).
Of these, the Levenberg-Marquardt (LM) algorithm is the most computationally efficient, so we will stick with it for now (although it has been claimed that Powell's dog-leg approach yields superior results).
The original implementation of the LM can be found here.
However, a detailed analysis of the SBA package reveals that much of the computation in evaluating the error involves four important matrices, namely W, S, A, and B. It therefore makes sense to reduce the computation time by offloading these operations to NVIDIA processors.
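For context, in the notation of Hartley and Zisserman (a sketch; the block names used inside the sba code may differ), A and B are the Jacobian blocks of the reprojection function with respect to the camera and structure parameters, and the reduced camera system is obtained via a Schur complement:

```latex
\begin{aligned}
U &= \textstyle\sum_i A_i^\top A_i, \qquad
V  = \textstyle\sum_i B_i^\top B_i, \qquad
W  = \textstyle\sum_i A_i^\top B_i,\\[2pt]
S &= U - W V^{-1} W^\top,
\end{aligned}
```

where the sums run over the individual image measurements. Because V is block-diagonal (one small block per 3D point), inverting it is cheap, and solving the reduced system in S is far smaller than solving the full normal equations.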
For particularly large image sequences, these matrices are extremely sparse. This is because each 3D point is visible in only a small fraction of the hundreds or thousands of images in the database. While the method suggested by Prof. Manolis Lourakis reduces processing time by a factor of over 500 compared to treating the matrices as dense, it can still be improved upon by using a different approach to storing these special matrices. These methods are explored in this paper.
In order to improve on this, I have tested the different matrix storage schemes suggested in this paper. You will need the CUDA libraries (version 3.0 or greater), and I recommend a Linux operating system. Similar attempts at merging large image sequences have been made by others; the Building Rome in a Day project at the University of Washington is one such example.