Embedding Human Knowledge into Deep Neural Network via Attention Map

Abstract

In this work, we aim to realize a method for embedding human knowledge into deep neural networks. While conventional methods for embedding human knowledge have been applied to non-deep machine learning, they are difficult to apply to deep learning models because of the enormous number of model parameters. To tackle this problem, we focus on the attention mechanism of an attention branch network (ABN). In this paper, we propose a fine-tuning method that utilizes a single-channel attention map manually edited by a human expert. Our fine-tuning method trains a network so that its output attention map corresponds to the edited one. As a result, the fine-tuned network outputs an attention map that takes human knowledge into account. Experimental results on ImageNet, CUB-200-2010, and IDRiD demonstrate that it is possible to obtain a clear attention map for visual explanation and to improve classification performance. Our findings offer a novel framework for optimizing networks through intuitive human editing via a visual interface and suggest new possibilities for human-machine cooperation, in addition to improved visual explanations.

Editing of attention map

With ABN, we should be able to adjust the recognition result by editing an attention map, since the attention map is used for the attention mechanism (as mentioned above). We investigate the behavior of ABN when an attention map is edited manually. Specifically, we measure how classification performance changes when attention maps are edited on the ImageNet dataset.
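The effect of editing can be illustrated with a minimal NumPy sketch. It assumes the attention mechanism weights the feature maps in the residual form (1 + M) * g, where M is the single-channel attention map and g is the feature map; this form, the array shapes, and the helper names `apply_attention` and `edit_attention` are illustrative assumptions, not details given in this section.

```python
import numpy as np


def apply_attention(features, attention_map):
    """Weight feature maps with a single-channel attention map.

    features: array of shape (C, H, W).
    attention_map: array of shape (H, W) with values in [0, 1].
    Assumed residual form: (1 + M) * g, broadcast over channels.
    """
    return (1.0 + attention_map[None, :, :]) * features


def edit_attention(attention_map, mask, value):
    """Return a copy of the map with the masked pixels set to `value`."""
    edited = attention_map.copy()
    edited[mask] = value
    return np.clip(edited, 0.0, 1.0)


rng = np.random.default_rng(0)
features = rng.random((8, 14, 14))   # hypothetical feature maps
attention = rng.random((14, 14))     # hypothetical attention map

# Hypothetical manual edit: suppress attention on the left half of the image.
mask = np.zeros((14, 14), dtype=bool)
mask[:, :7] = True
edited = edit_attention(attention, mask, 0.0)

weighted_before = apply_attention(features, attention)
weighted_after = apply_attention(features, edited)
```

Under this assumption, the suppressed region passes the features through unchanged while the untouched region keeps its original weighting, so the edit changes what the subsequent layers see and can therefore change the recognition result.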

Adjustment of recognition result by editing an attention map on visual explanation.

Attention editor

Proposed method

The process flow of the proposed method is shown in this figure. First, an ABN model is trained using labeled training samples, and we then collect the attention maps of these samples from the trained model. Here, we collect only the attention maps of misclassified training samples. Second, we edit each of these attention maps based on human knowledge so that the corresponding sample is recognized correctly. Third, the attention and perception branches of ABN are fine-tuned with the edited attention maps. During fine-tuning, we update the parameters of the attention and perception branches using a loss computed between the attention map output by ABN and the edited attention map, in addition to the loss of ABN.
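The combined fine-tuning loss described above might be sketched as follows. This is a sketch under stated assumptions: the map term is taken to be the L2 distance between the output and edited attention maps scaled by a hypothetical weight `gamma`, and the ABN loss is taken to be the sum of the cross-entropy losses of the attention and perception branches. The section itself does not specify these forms, and the function names are illustrative.

```python
import numpy as np


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[label]


def fine_tuning_loss(att_logits, per_logits, label,
                     attention_map, edited_map, gamma=1.0):
    """ABN loss plus a term pulling the output map toward the edited map.

    Assumptions (not stated in the text): the ABN loss is the sum of the
    two branch cross-entropies, and the map term is gamma * ||M' - M||_2.
    """
    abn_loss = cross_entropy(att_logits, label) + cross_entropy(per_logits, label)
    # For a 2-D array, np.linalg.norm defaults to the Frobenius norm,
    # i.e. the L2 norm over all pixels of the difference map.
    map_loss = gamma * np.linalg.norm(attention_map - edited_map)
    return abn_loss + map_loss


# Hypothetical toy inputs: three classes and a 2x2 attention map.
logits = np.zeros(3)
output_map = np.ones((2, 2))
edited_map = np.zeros((2, 2))

loss_same = fine_tuning_loss(logits, logits, 0, output_map, output_map)
loss_diff = fine_tuning_loss(logits, logits, 0, output_map, edited_map, gamma=0.5)
```

When the output map already matches the edited map, the extra term vanishes and only the ABN loss remains; the further the output map is from the human edit, the larger the penalty.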

Fine-tuning with attention maps manually edited by a human expert embeds human knowledge into the network and yields more appropriate attention maps for visual explanation. Moreover, introducing human knowledge through the attention map improves classification performance.

Attention map on CUB-200-2010 dataset

Attention map on IDRiD dataset

Bibtex

@article{Mitsuhara2019,
author = {Mitsuhara, Masahiro and Fukui, Hiroshi and Sakashita, Yusuke and Ogata, Takanori and Hirakawa, Tsubasa and Yamashita, Takayoshi and Fujiyoshi, Hironobu},
title = {{Embedding Human Knowledge into Deep Neural Network via Attention Map}},
year = {2019}
}