3DV 2024 Tutorial

Camera Geometry Problems in Computer Vision

Monday, March 18th, 2024

Course Information

Target Audience

Academic and industrial researchers at all levels. Given their key role in applications such as 3D reconstruction, autonomous driving, robotics, and augmented reality, camera geometry problems are of importance to a large part of the 3DV audience.

Overview

Camera geometry estimation is a crucial task in computer vision with many applications, e.g., in structure-from-motion (SfM) (and, by extension, areas relying on SfM as input, e.g., neural radiance fields), visual navigation, augmented and mixed reality, self-driving cars, large-scale 3D reconstruction, and visual localization. Due to the presence of noise and outliers in the input data, e.g., pixel-level correspondences, the predominant approach to camera geometry estimation is a hypothesis-and-test framework such as RANSAC. In RANSAC-like methods, two different types of solvers are used: (1) one for fitting a model to a minimal sample, and (2) one for refining a model on (a non-minimal subset of) all inliers, e.g., for final refinement or local optimization. For (1), the main objective is to solve the problem using as few correspondences as possible, since the number of RANSAC iterations (and thus the run-time) grows exponentially with the number of correspondences required for model estimation.
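
To make the hypothesis-and-test loop concrete, the following Python sketch shows a generic RANSAC loop together with the standard adaptive stopping criterion. The functions minimal_solver and residual are hypothetical placeholders standing in for a concrete minimal solver and its error measure; the sketch is illustrative and not tied to any particular library.

import numpy as np

def ransac(data, minimal_solver, residual, sample_size, threshold,
           confidence=0.99, max_iters=10000):
    # data:           array of correspondences (one row per correspondence)
    # minimal_solver: returns a list of candidate models fitted to sample_size correspondences
    # residual:       returns per-correspondence errors for a candidate model
    rng = np.random.default_rng()
    best_model, best_inliers = None, np.zeros(len(data), dtype=bool)
    num_iters, it = max_iters, 0
    while it < num_iters:
        sample = data[rng.choice(len(data), sample_size, replace=False)]
        for model in minimal_solver(sample):  # minimal solvers may return several solutions
            inliers = residual(model, data) < threshold
            if inliers.sum() > best_inliers.sum():
                best_model, best_inliers = model, inliers
                # Adaptive stopping criterion: with inlier ratio w, the number of
                # samples needed to draw an all-inlier sample with the requested
                # confidence grows exponentially with sample_size.
                w = best_inliers.mean()
                p_all_inlier = np.clip(w ** sample_size, 1e-12, 1.0 - 1e-12)
                num_iters = min(max_iters,
                                int(np.ceil(np.log(1.0 - confidence) /
                                            np.log(1.0 - p_all_inlier))))
        it += 1
    # In practice the best model is then refined on all of its inliers,
    # i.e., step (2) above (local optimization / non-linear refinement).
    return best_model, best_inliers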

Minimal problems often result in complex systems of polynomial equations in several variables. The introduction of algebraic methods, i.e., Gröbner basis and resultant-based methods for generating efficient polynomial solvers, into the computer vision community led to solutions to many previously unsolved problems, e.g., the relative and absolute pose problems for cameras with unknown focal length, unknown radial distortion, rolling shutter, and generalized and semi-generalized cameras. In addition, these methods resulted in solvers that can exploit the local image geometry of features, e.g., SIFT or affine features, use lines instead of points, combine lines and points or 2D-2D and 2D-3D matches, or specialize in certain types of camera motion (e.g., planar motion).
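
As an illustration of how a minimal problem turns into a polynomial system, the following SymPy sketch writes down the classical 5-point relative pose problem in the null-space parametrization commonly used by 5-point solvers. The basis matrices E1, ..., E4 are kept symbolic here purely as placeholders; in an actual solver they are numeric, obtained from the five point correspondences.

import sympy as sp

# Unknowns of the parametrization E = x*E1 + y*E2 + z*E3 + E4, where E1..E4
# span the 4-dimensional null space of the 5x9 matrix built from the five
# epipolar constraints x2_i^T E x1_i = 0. The basis matrices are symbolic
# placeholders here; a real solver computes them numerically (e.g., via SVD).
x, y, z = sp.symbols('x y z')
E1, E2, E3, E4 = [sp.Matrix(3, 3, list(sp.symbols(f'e{k}_0:9'))) for k in range(1, 5)]
E = x * E1 + y * E2 + z * E3 + E4

# Constraints that a valid essential matrix must satisfy:
equations = [E.det()]                          # rank-2 constraint: det(E) = 0
M = 2 * E * E.T * E - (E * E.T).trace() * E    # trace constraint (9 cubic equations)
equations += list(M)

# 'equations' is a system of 10 cubic polynomials in (x, y, z) with up to 10
# solutions; a Gröbner-basis / action-matrix solver generator turns such a
# system into efficient numerical code.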

While significant progress has been made on minimal and non-minimal solvers over the last decade, current applications surprisingly still rely on classical solvers (e.g., the well-known 5-point relative pose and P3P solvers) and do not use more modern ones. For example, the widely used SfM system COLMAP does not estimate focal length and radial distortion parameters during relative or absolute pose estimation, but only in a post-processing step. As a result, COLMAP (like many other SfM algorithms) struggles in the presence of strong radial distortion.

The aim of this tutorial is to raise awareness of the tools (solvers and solver generators) that are nowadays at the disposal of 3D vision researchers and practitioners, and of which problems can and cannot be efficiently solved at the moment. To this end, the tutorial will discuss current state-of-the-art minimal and non-minimal solvers, explain how to implement them and how to use them in practice, and give examples of their use in applications. The tutorial has three goals: 1) Provide a comprehensive overview of the current state of the art; at the same time, the tutorial will function as an introduction to the field, e.g., for first- and second-year students. 2) Have experts teach the tricks of the trade to more experienced PhD students and engineers who want to refine their knowledge. 3) Highlight current open problems, outlining what current algorithms can and cannot do. Throughout the tutorial, we provide links to publicly available source code for the discussed approaches.

Material

Slides will be made available after the tutorial.

Previous Versions of the Tutorial