Deep Projective Rotation Estimation through Relative Supervision

Brian Okorn*, Chuer Pan*, Martial Hebert, David Held

Robotics Institute, School of Computer Science, Carnegie Mellon University

{bokorn, chuerp, mhebert, dheld}@andrew.cmu.edu

(* indicates equal contribution)

ArXiv / Code (Coming SOON!)

Abstract

Orientation estimation is at the core of a variety of vision and robotics tasks, such as camera and object pose estimation. Deep learning has offered a way to develop image-based orientation estimators; however, such estimators often require training on a large labeled dataset, which can be time-intensive to collect. In this work, we explore whether self-supervised learning from unlabeled data can be used to alleviate this issue. Specifically, we assume access to estimates of the relative orientation between neighboring poses, such as can be obtained via a local alignment method. While self-supervised learning has been used successfully for translational object keypoints, we show that naively applying relative supervision to the rotation group SO(3) will often fail to converge due to the non-convexity of the rotational space. To tackle this challenge, we propose a new algorithm for self-supervised orientation estimation that uses Modified Rodrigues Parameters to stereographically project the closed manifold of SO(3) to the open manifold of R^3, allowing the optimization to be done in an open Euclidean space. We empirically validate the benefits of the proposed algorithm for the rotation averaging problem in two settings: (1) direct optimization of rotation parameters, and (2) optimization of the parameters of a convolutional neural network that predicts object orientations from images. In both settings, we demonstrate that our proposed algorithm converges to a consistent relative orientation frame much faster than algorithms that operate purely in the SO(3) space.

Motivation

When optimizing on a closed manifold, like SO(3) or the circle below, relative updates can result in deadlocks, where the optimization fails to converge. By opening the space, these deadlocks can be avoided, greatly improving convergence speeds.
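The deadlock can be reproduced in a few lines on the circle. The following is a minimal sketch, not the authors' example: it assumes four angles connected in a ring, exact relative measurements, and a squared wrapped-angle loss, all of which are choices made here for illustration. When every estimate starts at the same angle, the pulls from the two neighbors of each node cancel, so the gradient vanishes even though no relative constraint is satisfied.

```python
import numpy as np

def wrap(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return np.mod(angle + np.pi, 2.0 * np.pi) - np.pi

def loss_and_grad(theta, edges, deltas):
    """Squared wrapped residuals over all edges, and the gradient w.r.t. theta."""
    loss = 0.0
    grad = np.zeros_like(theta)
    for (i, j), delta in zip(edges, deltas):
        r = wrap(theta[j] - theta[i] - delta)  # residual of the relative constraint
        loss += r ** 2
        grad[j] += 2.0 * r
        grad[i] -= 2.0 * r
    return loss, grad

# Assumed toy problem: four orientations evenly spaced around the circle.
true_theta = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
deltas = [wrap(true_theta[j] - true_theta[i]) for i, j in edges]

# Initial estimate: all angles equal. Every edge is off by a quarter turn.
theta = np.zeros(4)
loss, grad = loss_and_grad(theta, edges, deltas)

print("loss:", loss)      # nonzero: the constraints are violated
print("gradient:", grad)  # all zeros: relative updates cancel, so descent stalls
```

Gradient descent on this loss makes no progress from the all-equal start, because the only way to satisfy the constraints is to wind the estimates once around the circle.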

Valid Relative Configuration

Intuition

The projection from a closed manifold, like S^3, to an open space, like R^3, reduces the ambiguity in determining the orientation closest to a set of rotations (their average), improving the convergence speed of algorithms supervised by these relative measurements. This improvement can be seen both in our empirical results and, theoretically, in a small intuitive example.

Modified Rodrigues Projection

The Modified Rodrigues Projection (MRP) allows us to supervise our systems using only relative transforms, while avoiding many of the local optima encountered when using Riemannian optimization alone. The relative SO(3) transforms are applied to the rotations in the closed manifold, and the results are then projected to the open space, where they can be used to supervise a learned method. The antipodal symmetry is handled by selecting the minimum-norm vector in the projective space.
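A minimal sketch of these steps, assuming unit quaternions in (w, x, y, z) order; it is not the authors' implementation. The function names `quat_mul`, `quat_to_mrp`, `mrp_to_quat`, and `relative_target` are hypothetical. The projection uses the standard Modified Rodrigues Parameter formula m = v / (1 + w), and the antipodal pair q and -q is resolved by flipping the sign so that w >= 0, which gives the representative with the smaller norm.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def quat_to_mrp(q):
    """Stereographic projection of a unit quaternion to R^3 (minimum-norm branch)."""
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)
    if q[0] < 0.0:           # q and -q are the same rotation; keep the one with w >= 0
        q = -q
    return q[1:] / (1.0 + q[0])

def mrp_to_quat(m):
    """Inverse projection from R^3 back to a unit quaternion."""
    m = np.asarray(m, dtype=float)
    s = m @ m
    return np.concatenate(([1.0 - s], 2.0 * m)) / (1.0 + s)

def relative_target(q_rel, q_est):
    """Apply a relative rotation on the closed manifold, then project to R^3."""
    return quat_to_mrp(quat_mul(q_rel, q_est))

# Usage: two 45-degree rotations about z compose to a 90-degree rotation.
half = np.pi / 8.0  # half of 45 degrees
q45 = np.array([np.cos(half), 0.0, 0.0, np.sin(half)])
target = relative_target(q45, q45)
print(target)
```

For a rotation by angle theta, the projected vector on this branch has norm tan(theta / 4), so every rotation maps inside the unit ball. A regression loss between a predicted vector and `target` can then be computed with ordinary Euclidean distance.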

Results on Uniformly Sampled Rotations

Our method shows faster and more accurate convergence on 100 random rotations, supervised using only the rotations between each orientation and its three nearest neighbors.
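A setup of this kind can be sketched as follows. This is an assumption-laden illustration rather than the authors' experiment code: it assumes rotations are stored as unit quaternions, sampled uniformly by normalizing 4D Gaussian vectors, and that "nearest" means smallest geodesic rotation angle. The random seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
num_rotations, num_neighbors = 100, 3

# Normalized 4D Gaussian samples are uniformly distributed over unit quaternions.
quats = rng.normal(size=(num_rotations, 4))
quats /= np.linalg.norm(quats, axis=1, keepdims=True)

# Geodesic angle between rotations; abs() accounts for q and -q being equivalent.
dots = np.clip(np.abs(quats @ quats.T), 0.0, 1.0)
angles = 2.0 * np.arccos(dots)
np.fill_diagonal(angles, np.inf)  # exclude each rotation from its own neighbor list

# Indices of the three nearest neighbors of every rotation.
neighbors = np.argsort(angles, axis=1)[:, :num_neighbors]
print(neighbors[:5])
```

Each row of `neighbors` defines the pairs whose relative rotations would serve as the only supervision signal.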