Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop
Benjamin Biggs [1], Oliver Boyne [1], James Charles [1], Andrew Fitzgibbon [2], Roberto Cipolla [1]
[1] University of Cambridge, [2] Microsoft
We introduce an automatic, end-to-end method for recovering the 3D pose and shape of dogs from monocular internet images. The large variation in shape between dog breeds, significant occlusion and the low quality of internet images make this a challenging problem. We learn a richer prior over shapes than previous work, which helps regularize parameter estimation. We demonstrate results on the Stanford Dog Dataset, an 'in-the-wild' dataset of 20,580 dog images, for which we have collected 2D joint and silhouette annotations split into training and evaluation sets. To capture the large shape variety of dogs, we show that the natural variation in this 2D dataset is enough to learn a detailed 3D shape prior through expectation maximization (EM). As a byproduct of training, we generate a new parameterized model, SMBLD, which adds limb scaling, and release it alongside our new annotation dataset StanfordExtra to the research community.
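The EM-in-the-loop idea alternates between fitting per-image shape coefficients under the current prior (E-step) and re-estimating the prior from those fitted coefficients (M-step). Below is a minimal sketch of the M-step and the resulting shape penalty, assuming a Gaussian prior over SMBLD shape coefficients; the function names, dimensionality and synthetic data are illustrative, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)

def m_step(betas, eps=1e-6):
    """Re-estimate the Gaussian shape prior (mean, covariance) from the
    per-image shape coefficients recovered by the E-step fits. betas: (N, D)."""
    mu = betas.mean(axis=0)
    centered = betas - mu
    cov = centered.T @ centered / len(betas) + eps * np.eye(betas.shape[1])
    return mu, cov

def prior_nll(beta, mu, cov_inv):
    """Mahalanobis shape penalty added to the fitting loss during the E-step."""
    d = beta - mu
    return 0.5 * float(d @ cov_inv @ d)

# Toy run on synthetic "fitted" coefficients; in the real pipeline these
# would come from fitting SMBLD to 2D joints and silhouettes per image.
D = 20                                              # illustrative shape dimension
betas = rng.normal(size=(500, D)) * np.linspace(0.5, 2.0, D)
mu, cov = m_step(betas)
print(prior_nll(betas[0], mu, np.linalg.inv(cov)))

Each EM round tightens the prior around shapes that actually explain the 2D evidence, which in turn regularizes the next round of per-image fits.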
StanfordExtra: 2D keypoint and segmentation annotations for 12,000 images of 120 dog breeds in various poses.
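For illustration, a hedged loader sketch for such annotations: it assumes the data ships as a JSON list of per-image records carrying 2D joints and a silhouette reference. The field names used here ("img_path", "joints", "seg") and the file name are assumptions and may not match the released StanfordExtra file.

import json
import numpy as np

def load_annotations(json_path):
    """Yield (image path, joints, silhouette) per annotated image.
    Assumes a JSON list of records; field names are illustrative."""
    with open(json_path) as f:
        data = json.load(f)
    for entry in data:
        # (K, 3) array: x, y, visibility flag per 2D keypoint (assumed layout)
        joints = np.asarray(entry["joints"], dtype=np.float32)
        yield entry["img_path"], joints, entry.get("seg")

# Example usage (path is hypothetical):
# for img_path, joints, seg in load_annotations("stanford_extra.json"):
#     ...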
@inproceedings{biggs2020wldo,
  title={{W}ho left the dogs out?: {3D} animal reconstruction with expectation maximization in the loop},
  author={Biggs, Benjamin and Boyne, Oliver and Charles, James and Fitzgibbon, Andrew and Cipolla, Roberto},
  booktitle={ECCV},
  year={2020}
}
The authors would like to thank the GSK AI team for providing access to their GPU cluster, Michael Sutcliffe, Thomas Roddick, Matthew Allen and Peter Fisher for useful technical discussions, and the GSK TDI group for project sponsorship.
Please get in touch if you are interested in this work or have any questions: Benjamin Biggs