Weakly Supervised Object Localization with Progressive Domain Adaptation
CVPR 2016


We address the problem of weakly supervised object localization, where only image-level annotations are available for training. Many existing approaches tackle this problem through object proposal mining. However, a substantial amount of noise in object proposals causes ambiguities for learning discriminative object models. Such approaches are sensitive to model initialization and often converge to an undesirable local minimum. In this paper, we address this problem by progressive domain adaptation with two main steps: classification adaptation and detection adaptation. In classification adaptation, we transfer a pre-trained network to our multi-label classification task for recognizing the presence of a certain object in an image. In detection adaptation, we first use a mask-out strategy to collect class-specific object proposals and apply multiple instance learning to mine confident candidates. We then use these selected object proposals to fine-tune all the layers, resulting in a fully adapted detection network. We extensively evaluate the localization performance on the PASCAL VOC and ILSVRC datasets and demonstrate significant performance improvement over state-of-the-art methods.
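
The mask-out step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a proposal is scored by how much the class score drops once its region is hidden from the classifier. The `classify` callable, the `fill_value` parameter, and the toy classifier below are hypothetical stand-ins for illustration, not the released Caffe models or the authors' exact procedure.

```python
import numpy as np


def mask_out_scores(image, boxes, classify, fill_value=0.0):
    """Score each box by the drop in class score when its region is masked.

    image:    H x W x C float array.
    boxes:    iterable of (x1, y1, x2, y2) pixel boxes, x2/y2 exclusive.
    classify: hypothetical callable mapping an image to a scalar class score.
    """
    base = classify(image)
    drops = []
    for x1, y1, x2, y2 in boxes:
        masked = image.copy()
        masked[y1:y2, x1:x2] = fill_value  # hide the proposal region
        drops.append(base - classify(masked))
    return np.asarray(drops)


# Toy stand-in for a classifier: mean intensity acts as the "class score".
def toy_classifier(img):
    return float(img.mean())


image = np.zeros((8, 8, 1))
image[2:6, 2:6] = 1.0  # the "object" occupies a 4x4 patch
boxes = [(2, 2, 6, 6), (0, 0, 2, 2), (4, 4, 8, 8)]

drops = mask_out_scores(image, boxes, toy_classifier)
best = int(np.argmax(drops))  # index of the box with the largest score drop
```

In this toy setup the box that covers the whole patch produces the largest drop, so it would be kept as the most likely class-specific proposal. A real pipeline would replace `toy_classifier` with the score of the classification-adapted network for one class.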


Code and models

Source code: [GitHub page].

AlexNet-based classification adaptation model: [caffemodel][prototxt]
AlexNet-based detection adaptation model: [caffemodel][prototxt]

VGGNet-based classification adaptation model: [caffemodel][prototxt]
VGGNet-based detection adaptation model: [caffemodel][prototxt]


@inproceedings{li2016weakly,
    author = {Li, Dong and Huang, Jia-Bin and Li, Yali and Wang, Shengjin and Yang, Ming-Hsuan},
    title = {Weakly Supervised Object Localization with Progressive Domain Adaptation},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
    year = {2016},
    volume = {},
    number = {},
    pages = {}
}