PhD Research & Code

Robust and Optimal Methods for Geometric Sensor Data Alignment

The main theme of my research is improving methods for geometric sensor data alignment. The problem of geometrically aligning sensor data is the problem of ﬁnding the rigid transformation that correctly aligns two sets of sensor data without any prior knowledge about how the data corresponds. The data may be of diﬀerent dimensionality or captured using diﬀerent sensors but must provide positional or directional information, such as a 3D laser scan or a 2D image.

I investigate two directions in which the optimisation of alignment problems can be improved. The first involves research into data representations and objective functions that are robust to structured outliers and sampling artefacts. Structured outliers are very common, arising whenever the two sets of sensor data were captured from different viewpoints. The second involves research into globally-optimal approaches to alignment that are guaranteed to find the global optimum. This avoids the susceptibility to local optima exhibited by most alignment approaches.

2D–3D Geometric Sensor Data Alignment

Globally-Optimal Spherical Mixture Alignment (GOSMA)

Determining the position and orientation of a calibrated camera from a single image with respect to a 3D model is an essential task for many applications. When 2D-3D correspondences can be obtained reliably, perspective-n-point solvers can be used to recover the camera pose. However, without the pose it is non-trivial to find cross-modality correspondences between 2D images and 3D models, particularly when the latter only contains geometric information. Consequently, the problem becomes one of estimating pose and correspondences jointly. Since outliers and local optima are so prevalent, robust objective functions and global search strategies are desirable. Hence, we cast the problem as a 2D-3D mixture model alignment task and propose the first globally-optimal solution to this formulation under the robust L2 distance between mixture distributions. We derive novel bounds on this objective function and employ branch-and-bound to search the 6D space of camera poses, guaranteeing global optimality without requiring a pose estimate. To accelerate convergence, we integrate local optimization, implement GPU bound computations, and provide an intuitive way to incorporate side information such as semantic labels. The algorithm is evaluated on challenging synthetic and real datasets, outperforming existing approaches and reliably converging to the global optimum.

Figure 1. Spherical mixture alignment for estimating the 6-DoF absolute pose (R, t) of a camera from a single image I, relative to a 3D model (eg point-set P), without 2D-3D correspondences. Our algorithm recovers the transformation by generating mixture distributions from the data --- a von Mises-Fisher Mixture Model (vMFMM) from the image via a bearing vector set F and a Gaussian Mixture Model (GMM) from the 3D model, projected onto the sphere as a quasi-Projected Normal Mixture Model (qPNMM) --- and applying branch-and-bound with tight novel bounds to find R and t that optimally aligns these spherical mixtures.

Code:

The code for Globally-Optimal Spherical Mixture Alignment (GOSMA) is made freely available to the scientific community. Any publications resulting from the use of this code should cite the following paper:

Dylan Campbell, Lars Petersson, Laurent Kneip, Hongdong Li and Stephen Gould, The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, June 2019 [BibTeX]

The code is available here. For further information on the code, please contact dylan.campbell@anu.edu.au.

Globally-Optimal Pose And Correspondences (GOPAC)

Estimating the 6 DoF pose of a camera from a single image relative to a pre-computed 3D point-set is an important task for many computer vision applications. Perspective-n-Point (PnP) solvers are routinely used for camera pose estimation, provided that a good quality set of 2D-3D feature correspondences are known beforehand. However, finding optimal correspondences between 2D key-points and a 3D point-set is non-trivial, especially when only geometric (position) information is known. Existing approaches to the simultaneous pose and correspondence problem use local optimisation, and are therefore unlikely to find the optimal solution without a good pose initialisation, or introduce restrictive assumptions. Since a large proportion of outliers are common for this problem, we instead propose a globally-optimal inlier set cardinality maximisation approach which jointly estimates optimal camera pose and optimal correspondences. Our approach employs branch-and-bound to search the 6D space of camera poses, guaranteeing global optimality without requiring a pose prior. The geometry of SE(3) is used to find novel upper and lower bounds for the number of inliers and local optimisation is integrated to accelerate convergence. The evaluation empirically supports the optimality proof and shows that the method performs much more robustly than existing approaches, including on a large-scale outdoor data-set.

Figure 1. Estimating the pose of a calibrated camera from a single image within a large-scale, unorganised 3D point-set captured by vehicle-mounted laser scanner. Our method solves the absolute pose problem while simultaneously finding feature correspondences, using a globally-optimal branch-and-bound approach with tight novel bounds on the cardinality of the inlier set: (top) 3D point-set (grey and green), 3D features (black dots) and ground-truth (black), RANSAC (red) and our (blue) camera poses; (upper middle) panoramic photograph and extracted 2D features; (lower middle) building points projected onto the image using the RANSAC camera pose; (bottom) building points projected using our camera pose.

campbell2017globally_v10_slides.pdf

Code:

The code for Globally-Optimal Pose And Correspondences (GOPAC) is made freely available to the scientific community. Any publications resulting from the use of this code should cite one of the following papers:

Dylan Campbell, Lars Petersson, Laurent Kneip and Hongdong Li, Globally-Optimal Inlier Set Maximisation for Camera Pose and Correspondence Estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018 [BibTeX]
Dylan Campbell, Lars Petersson, Laurent Kneip and Hongdong Li, Globally-Optimal Inlier Set Maximisation for Simultaneous Camera Pose and Feature Correspondence, IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017 [BibTeX]

Download the code for Globally-Optimal Pose and Correspondences. The gopac zip file contains a README.txt file in the root folder. The license can be viewed in the license.txt file in the root directory. For further information on the code, please contact dylan.campbell@anu.edu.au or lars.petersson@data61.csiro.au.

3D–3D Geometric Sensor Data Alignment

Globally-Optimal Gaussian Mixture Alignment (GOGMA)

Gaussian mixture alignment is a family of approaches that are frequently used for robustly solving the point-set registration problem. However, since they use local optimisation, they are susceptible to local minima and can only guarantee local optimality. Consequently, their accuracy is strongly dependent on the quality of the initialisation. This paper presents the first globally-optimal solution to the 3D rigid Gaussian mixture alignment problem under the L2 distance between mixtures. The algorithm, named GOGMA, employs a branch-and-bound approach to search the space of 3D rigid motions SE(3), guaranteeing global optimality regardless of the initialisation. The geometry of SE(3) was used to find novel upper and lower bounds for the objective function and local optimisation was integrated into the scheme to accelerate convergence without voiding the optimality guarantee. The evaluation empirically supported the optimality proof and showed that the method performed much more robustly on two challenging datasets than an existing globally-optimal registration solution.

Figure 2. Uncertainty region induced by a hypercube C = Cr x Ct. (a) Rotation uncertainty region for Cr. The optimal rotation of x may be anywhere within the heavily-shaded umbrella-shaped uncertainty region, which is entirely contained by the lightly-shaded spherical cap. (b) Translation uncertainty region for Ct. The optimal translation of x may be anywhere within the cube, which is entirely contained by the circumscribed sphere.

Code:

The code for Globally-Optimal Gaussian Mixture Alignment (GOGMA) is made freely available to the scientific community. Any publications resulting from the use of this code should cite the following papers:

Dylan Campbell and Lars Petersson, GOGMA: Globally-Optimal Gaussian Mixture Alignment, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, IEEE, June 2016, pp. 5685-5694. [BibTeX]
Dylan Campbell and Lars Petersson, An Adaptive Data Representation for Robust Point-Set Registration and Merging, IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, IEEE, December 2015, pp. 4292-4300. [BibTeX]

Download the code for Globally-Optimal Gaussian Mixture Alignment. C++ and CUDA code is included. The gogma zip file contains a README.txt file in the root folder. The license can be viewed in the license.txt file in the root directory. For further information on the code, please contact dylan.campbell@data61.csiro.au or lars.petersson@data61.csiro.au.

Support Vector Registration (SVR)

This paper presents a framework for rigid point-set registration and merging using a robust continuous data representation. Our point-set representation is constructed by training a one-class support vector machine with a Gaussian radial basis function kernel and subsequently approximating the output function with a Gaussian mixture model. We leverage the representation’s sparse parametrisation and robustness to noise, outliers and occlusions in an efficient registration algorithm that minimises the L2 distance between our support vector–parametrised Gaussian mixtures. In contrast, existing techniques, such as Iterative Closest Point and Gaussian mixture approaches, manifest a narrower region of convergence and are less robust to occlusions and missing data, as demonstrated in the evaluation on a range of 2D and 3D datasets. Finally, we present a novel algorithm, GMMerge, that parsimoniously and equitably merges aligned mixture models, allowing the framework to be used for reconstruction and mapping.

Figure 3. Robust point-set registration and merging framework. An nD point-set is represented as an SVGM by training a one-class SVM (a) and then mapping it to a GMM (b). The SVR algorithm is used to minimise the L2 distance between two SVGMs in order to align the densities (c). Finally, the GMMerge algorithm is used to parsimoniously fuse the two mixtures. The SVMs are visualised as support vector points scaled by mixture weight and the SVGMs are coloured by probability value.

Code:

The code for Support Vector Registration (SVR) is made freely available to the scientific community. Any publications resulting from the use of this code should cite the following paper:

Dylan Campbell and Lars Petersson, An Adaptive Data Representation for Robust Point-Set Registration and Merging, IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, IEEE, December 2015, pp. 4292-4300. [BibTeX]

Download the code for Support Vector Registration. MATLAB and C++ versions of the code are included. The C++ version is significantly faster, but the MATLAB optimisation package is more powerful and so tends to be more stable and accurate. The svr.zip file contains README files in the distribution/MATLAB/demo and distribution/C++/build folders. The license can be viewed in the license.txt file in the root directory. For further information on the code, please contact dylan.campbell@data61.csiro.au or lars.petersson@data61.csiro.au.