Open-World Instance Segmentation:
Exploiting Pseudo Ground Truth Learned From Pairwise Affinity
Weiyao Wang, Matt Feiszli, Heng Wang, Jitendra Malik, Du Tran
Meta AI Research, UC Berkeley
To Appear at CVPR 2022
Paper Abstract
Open-world instance segmentation is the task of grouping pixels into object instances without any pre-determined taxonomy. This is challenging, as state-of-the-art methods rely on explicit class semantics obtained from large labeled datasets, and out-of-domain evaluation performance drops significantly. Here we propose a novel approach for mask proposals, Generic Grouping Networks (GGNs), constructed without semantic supervision. Our approach combines a local measure of pixel affinity with instance-level mask supervision, producing a training regimen designed to make the model as generic as the data diversity allows. We introduce a method for predicting Pairwise Affinities (PA), a learned local relationship between pairs of pixels. PA generalizes very well to unseen categories. From PA we construct a large set of pseudo-ground-truth instance masks; combined with human-annotated instance masks we train GGNs and significantly outperform the SOTA on open-world instance segmentation on various benchmarks including COCO, LVIS, ADE20K, and UVO.
Our Approach: Generic Grouping Network
Train a Pairwise Affinity (PA) predictor to classify whether a pair of pixels belongs to the same instance
Use a grouping module to generate pseudo ground truth masks to supplement annotations
Train an object detector on a combination of pseudo-GT and annotated-GT masks
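The first two steps can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `affinity_targets` and `group_pixels` are hypothetical, affinities are computed only between each pixel and its right and lower neighbors, background pixels (id 0) are assumed to be mutually affine, and grouping is a plain threshold-plus-connected-components stand-in for the grouping module.

```python
def affinity_targets(instance_map):
    """Binary affinity targets from an instance-id map (list of rows).

    Assumption: 1 means both pixels carry the same id (background id 0 included).
    Returns (right, down): right[y][x] compares (y, x) with (y, x + 1),
    down[y][x] compares (y, x) with (y + 1, x).
    """
    h, w = len(instance_map), len(instance_map[0])
    right = [[int(instance_map[y][x] == instance_map[y][x + 1])
              for x in range(w - 1)] for y in range(h)]
    down = [[int(instance_map[y][x] == instance_map[y + 1][x])
             for x in range(w)] for y in range(h - 1)]
    return right, down


def group_pixels(right, down, threshold=0.5):
    """Merge pixels whose affinity exceeds `threshold` (union-find).

    Assumes the image has at least two rows. Returns a label map in which
    each connected group of affine pixels shares one integer label.
    """
    h, w = len(right), len(down[0])
    parent = list(range(h * w))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for y in range(h):
        for x in range(w):
            idx = y * w + x
            if x + 1 < w and right[y][x] > threshold:
                parent[find(idx + 1)] = find(idx)
            if y + 1 < h and down[y][x] > threshold:
                parent[find(idx + w)] = find(idx)

    # Relabel roots to consecutive integers in scan order.
    label_of_root = {}
    labels = []
    for y in range(h):
        row = []
        for x in range(w):
            root = find(y * w + x)
            row.append(label_of_root.setdefault(root, len(label_of_root)))
        labels.append(row)
    return labels


# Toy example: two instances (ids 1 and 2) next to a background column (id 0).
instance_map = [
    [1, 1, 0],
    [1, 2, 0],
    [2, 2, 0],
]
right, down = affinity_targets(instance_map)
pseudo_masks = group_pixels(right, down)
```

In practice the affinities passed to `group_pixels` would be the network's predicted probabilities rather than ground-truth targets, which is why the function takes a threshold. Each resulting label then plays the role of one pseudo-ground-truth mask.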
PA Generalizes Well
Pseudo-GT from PA covers a wide range of object categories, including those that are not annotated during the training of PA.
In the example on the left, despite being trained only on Person, PA generalizes to pot, pan, bird, temple, etc.