Learning Consensus Representation for Weak Style Classification
Shuhui Jiang1, Ming Shao3, Chengcheng Jia1 and Yun Fu1,2
1Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA
2College of Computer and Information Science, Northeastern University, Boston, MA 02115, USA
3Department of Computer and Information Science, University of Massachusetts Dartmouth, Dartmouth, MA 02747, USA
{shjiang,yunfu}@ece.neu.edu, mshao@umassd.edu, jiachengcheng128@gmail.com
Abstract
Style classification (e.g., Baroque and Gothic architectural styles) is attracting increasing attention in many fields such as fashion, architecture, and manga. Most existing methods focus on extracting discriminative features from local patches or patterns. However, the spread-out phenomenon in style classification has not yet been recognized: visually less representative images in a style class are usually very diverse and easily misclassified. We call them weak style images. Another issue when employing multiple visual features for effective weak style classification is the lack of consensus among different features; that is, the weights for different visual features of the same local patch should be allocated similar values. To address these issues, we propose a Consensus Style Centralizing Auto-Encoder (CSCAE) for learning robust style feature representations, especially for weak style classification. First, we propose a Style Centralizing Auto-Encoder (SCAE), which centralizes weak style features in a progressive way. Then, based on SCAE, we propose both non-linear and linear versions of CSCAE, which adaptively allocate weights to different features during the progressive centralization process. Consensus constraints are added based on the assumption that the weights of different features of the same patch should be similar. In particular, the linear counterpart of CSCAE, motivated by the ``shared weights'' idea as well as group sparsity, improves both efficacy and efficiency. For evaluation, we conduct extensive experiments on fashion, manga, and architecture style classification problems. In addition, we collect a new dataset, Online Shopping, for fashion style classification, which will be made publicly available for vision-based fashion style research. Experiments demonstrate the effectiveness of SCAE and CSCAE on both public and newly collected datasets in comparison with recent state-of-the-art methods.
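To make the progressive-centralization idea concrete, the following is a minimal linear sketch in NumPy. It is not the paper's exact objective: `centralizing_layer`, the ridge closed form, and the synthetic data are illustrative assumptions. One "layer" maps features of style level $\phi$ toward a centralized target (e.g., the mean feature of same-class samples at level $\phi+1$); stacking such layers would mimic the progressive scheme.

```python
import numpy as np

def centralizing_layer(X_low, targets, lam=1e-3):
    """One progressive-centralization step (linear sketch).

    X_low:   (n, d) features at style level phi
    targets: (n, d) centralized targets, e.g. for each sample the mean
             feature of same-class samples at level phi + 1
    Returns W of shape (d, d) minimizing
        ||targets - X_low @ W||_F^2 + lam * ||W||_F^2
    via the ridge-regression closed form.
    """
    d = X_low.shape[1]
    # Solve (X^T X + lam I) W = X^T T for W.
    return np.linalg.solve(X_low.T @ X_low + lam * np.eye(d),
                           X_low.T @ targets)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic weak-style features scattered around 5.0,
    # to be pulled toward a strong-style centroid at 6.0.
    X = rng.normal(size=(50, 4)) + 5.0
    T = np.full((50, 4), 6.0)
    W = centralizing_layer(X, T)
    # The mapped features should lie closer to the centroid.
    before = np.linalg.norm(X - T)
    after = np.linalg.norm(X @ W - T)
    print(after < before)
```

In the paper's consensus variant, one such weight matrix per visual feature would additionally be constrained so that corresponding patch weights agree across features; that coupling is omitted here for brevity.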
Fashion Style Classification
Visualization of Fashion Style Classification Results
Figure: Visualization of correct (left) and incorrect (right) classification results on the Online Shopping dataset under different style levels $\phi$. Below the images, correct category labels (ground truth) are marked in blue (first row) and incorrect ones in red (second row). Both the correct and incorrect examples show that as $\phi$ decreases, it becomes harder for humans to distinguish a fashion style. For example, among the correct results, it is easier to recognize the ``Avant'' style at $\phi=6$ (a higher style level) than at $\phi=2$. Among the incorrect examples, although the estimated labels differ from the ground truth, they are still reasonable by common sense; for instance, in the first column of the incorrect results at $\phi=6$, the ground truth is ``Splendid'' while our estimate is ``Modern''. A red box on an image indicates that the estimated label is neither the same as the ground truth nor acceptable by common sense.