Quanshi Zhang, Ruiming Cao, Ying Nian Wu, and Song-Chun Zhu in AAAI 2017
Download the paper from here.
This paper proposes a learning strategy that extracts object-part concepts from a pre-trained convolutional neural network (CNN), in an attempt to 1) explore explicit semantics hidden in CNN units and 2) gradually grow a semantically interpretable graphical model on the pre-trained CNN for hierarchical object understanding. Given part annotations on very few (e.g., 3–12) objects, our method mines certain latent patterns from the pre-trained CNN and associates them with different semantic parts. We use a four-layer And-Or graph to organize the mined latent patterns, so as to clarify their internal semantic hierarchy. Our method is guided by a small number of part annotations, and it achieves superior performance (about 13%–107% improvement) in part center prediction on the PASCAL VOC and ImageNet datasets.
Table 1, Part center prediction accuracy of 3-shot learning on the ILSVRC 2013 DET Animal-Part dataset.
Table 3, Part center prediction accuracy of 3-shot learning on the CUB200-2011 dataset.
Figure 3, Image reconstruction based on the mined latent patterns in the AOG (for pattern visualization)
Figure 4, Heat map of CNN units within a part template. To generate the heat map, we sum the CNN units that the AOG associates with the part template across all conv-slices of the 5th conv-layer.
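The summation described in the Figure 4 caption can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the 5th conv-layer output is available as a nested list indexed as [conv-slice][row][column], that the units the AOG associates with a part template are given as (slice, row, column) index triples, and that "summing the units" means summing their activation values at each spatial position. The names `part_heat_map`, `conv5`, and `selected_units` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical index of one CNN unit: (conv-slice, row, column).
Unit = Tuple[int, int, int]


def part_heat_map(
    conv5: List[List[List[float]]],
    selected_units: Iterable[Unit],
) -> List[List[float]]:
    """Sum the activations of the selected units over all conv-slices.

    conv5 is assumed to be indexed as [slice][row][column]; the result
    is a 2-D map with one value per spatial position.
    """
    n_rows = len(conv5[0])
    n_cols = len(conv5[0][0])
    heat = [[0.0] * n_cols for _ in range(n_rows)]
    for c, y, x in selected_units:
        # Units from different slices at the same position accumulate.
        heat[y][x] += conv5[c][y][x]
    return heat


# Toy feature map: 2 conv-slices, each 2 x 3.
conv5 = [
    [[1.0, 2.0, 0.0], [0.0, 3.0, 1.0]],
    [[0.5, 0.0, 4.0], [2.0, 1.0, 0.0]],
]
# Units assumed to be associated with one part template.
selected_units = {(0, 0, 1), (1, 0, 2), (0, 1, 1), (1, 1, 1)}

heat = part_heat_map(conv5, selected_units)
```

The sketch uses plain Python lists so that it runs without dependencies. With real feature maps, an array library and a boolean mask over the units would be the more natural choice.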
Figure 5, Localization of semantic parts based on the AOG
Please contact Dr. Quanshi Zhang (https://sites.google.com/site/quanshizhang/) if you have questions.