A Novel Global Spatial Attention Mechanism in Convolutional Neural Network for Medical Image Classification

Paper title

A Novel Global Spatial Attention Mechanism in Convolutional Neural Network for Medical Image Classification

Authors

Xu, Linchuan; Huang, Jun; Nitanda, Atsushi; Asaoka, Ryo; Yamanishi, Kenji

Abstract

Spatial attention has been introduced into convolutional neural networks (CNNs) to improve both their performance and their interpretability in visual tasks, including image classification. The essence of spatial attention is to learn a weight map that represents the relative importance of activations within the same layer or channel. All existing attention mechanisms are local in the sense that their weight maps are image-specific. In the medical field, however, there are cases where all images should share the same weight map, because the set of images records the same kind of symptom related to the same object and therefore shares the same structural content. In this paper, we thus propose a novel global spatial attention mechanism in CNNs, mainly for medical image classification. The global weight map is instantiated by a decision boundary between important and unimportant pixels. We propose to realize the decision boundary with a binary classifier in which the intensities of all images at a pixel are the features of that pixel. The binary classification is integrated into an image classification CNN and optimized together with it. Experiments on two medical image datasets and one facial expression dataset showed that, with the proposed attention, the performance of four powerful CNNs (GoogleNet, VGG, ResNet, and DenseNet) can be improved and meaningful attended regions can be obtained, which is beneficial for understanding the content of images of a domain.
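
The idea of one weight map shared by every image can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a logistic (linear plus sigmoid) binary classifier and applies the map directly to input pixels, and it uses fixed illustrative parameters, whereas the abstract describes optimizing the classifier jointly with the CNN. The names `global_weight_map`, `apply_weight_map`, `w`, and `b` are hypothetical.

```python
import math
import random


def global_weight_map(images, w, b):
    """Compute one weight map shared by all images.

    images: list of N images, each an H x W nested list of intensities.
    w, b:   parameters of an assumed logistic classifier (w has length N).
    Returns an H x W nested list of weights in (0, 1).
    """
    height, width = len(images[0]), len(images[0][0])
    weight_map = []
    for r in range(height):
        row = []
        for c in range(width):
            # The feature vector of a pixel is its intensity in every image.
            features = [img[r][c] for img in images]
            logit = sum(wi * fi for wi, fi in zip(w, features)) + b
            # The sigmoid gives a soft "important vs. unimportant" score.
            row.append(1.0 / (1.0 + math.exp(-logit)))
        weight_map.append(row)
    return weight_map


def apply_weight_map(image, weight_map):
    """Reweight one image element-wise with the shared map."""
    return [
        [value * weight for value, weight in zip(img_row, map_row)]
        for img_row, map_row in zip(image, weight_map)
    ]


random.seed(0)
# Eight random 4 x 4 "images" stand in for a dataset (illustrative only).
images = [[[random.random() for _ in range(4)] for _ in range(4)] for _ in range(8)]
w = [random.gauss(0.0, 1.0) for _ in range(8)]  # untrained, illustrative weights
weight_map = global_weight_map(images, w, b=0.0)
attended = [apply_weight_map(img, weight_map) for img in images]
```

The map depends only on the pixel position and the dataset as a whole, so every image is reweighted by the same values. This is the sense in which the attention is global, in contrast to the image-specific maps of local attention.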
