MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization

Zengyi Qin, Jinglu Wang and Yan Lu

The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), 2019, Oral


Localizing objects in real 3D space, which plays a crucial role in scene understanding, is particularly challenging given only a single RGB image, because geometric information is lost during image projection. We propose MonoGRNet for amodal 3D object localization from a monocular RGB image via geometric reasoning in both the observed 2D projection and the unobserved depth dimension. MonoGRNet is a single, unified network composed of four task-specific subnetworks, responsible for 2D object detection, instance depth estimation (IDE), 3D localization, and local corner regression.
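To illustrate how a 2D projection and an estimated depth can jointly determine a 3D location, the following is a minimal sketch under a standard pinhole camera model. It is not taken from the MonoGRNet code: the function names `backproject` and `project`, and the intrinsic values used below, are assumptions chosen purely for illustration.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a known depth to camera coordinates (x, y, z).

    Assumes an ideal pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy), all in pixels; depth is along the optical axis.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def project(x, y, z, fx, fy, cx, cy):
    """Project a 3D camera-space point back onto the image plane."""
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical intrinsics and a hypothetical detection (illustrative values only).
fx, fy, cx, cy = 700.0, 700.0, 600.0, 180.0
center_3d = backproject(650.0, 200.0, 14.0, fx, fy, cx, cy)
reprojected = project(*center_3d, fx, fy, cx, cy)
```

Under these assumptions, the depth is the only quantity not observable in the image: once it is estimated for an object instance, the remaining two coordinates follow from the pixel position and the camera intrinsics.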

[PDF] [Code]