Paper Title

Playing to distraction: towards a robust training of CNN classifiers through visual explanation techniques

Paper Authors

David Morales, Estefania Talavera, Beatriz Remeseiro

Abstract

The field of deep learning is evolving in different directions, and there is still a need for more efficient training strategies. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms that focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier during the learning process, forcing it to focus not only on relevant regions but also on those that, a priori, are not so informative for discriminating the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our model on a real-case scenario for the classification of egocentric images, allowing us to obtain relevant information about people's lifestyles. In particular, we work on the challenging EgoFoodPlaces dataset, achieving state-of-the-art results with a lower level of complexity. The obtained results indicate the suitability of our proposed training scheme for image classification, improving the robustness of the final model.
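The core "distraction" idea in the abstract can be illustrated with a minimal sketch: given a saliency map from a visual explanation technique (e.g., a Grad-CAM-style heatmap), occlude the most relevant regions so the classifier must also learn from the remaining, less informative areas. This is a hypothetical illustration of the general principle, not the paper's exact method; the `top_fraction` parameter and `distraction_mask` helper are assumptions for the sketch.

```python
import numpy as np

def distraction_mask(image, saliency, top_fraction=0.25, fill_value=0.0):
    """Occlude the most salient regions of an image so that a classifier
    trained on the result is forced to use less informative regions too.

    image:    H x W x C float array
    saliency: H x W float array (e.g., a Grad-CAM heatmap); higher = more relevant
    """
    # Saliency value above which a pixel counts as "most relevant".
    threshold = np.quantile(saliency, 1.0 - top_fraction)
    mask = saliency >= threshold            # True where the model already looks
    distracted = image.copy()
    distracted[mask] = fill_value           # hide those regions from the classifier
    return distracted

# Toy example: a 4x4 one-channel "image" whose top-left block is most salient.
img = np.ones((4, 4, 1), dtype=np.float32)
sal = np.zeros((4, 4), dtype=np.float32)
sal[:2, :2] = 1.0                           # explanation highlights the top-left block
out = distraction_mask(img, sal, top_fraction=0.25)
# The 4 most salient pixels are zeroed; the other 12 are untouched.
```

In a training loop, such masked images would be fed back to the network alongside (or instead of) the originals, so the loss penalizes over-reliance on a single discriminative region.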
