Paper Title
Filter Grafting for Deep Neural Networks
Paper Authors
Paper Abstract
This paper proposes a new learning paradigm called filter grafting, which aims to improve the representation capability of Deep Neural Networks (DNNs). The motivation is that DNNs have unimportant (invalid) filters (e.g., l1 norm close to 0). These filters limit the potential of DNNs since they are identified as having little effect on the network. While filter pruning removes these invalid filters for efficiency considerations, filter grafting re-activates them from an accuracy-boosting perspective. The activation is performed by grafting external information (weights) into the invalid filters. To better perform the grafting process, we develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks. After the grafting operation, the network has very few invalid filters compared with its untouched state, empowering the model with more representation capacity. We also perform extensive experiments on classification and recognition tasks to show the superiority of our method. For example, the grafted MobileNetV2 outperforms the non-grafted MobileNetV2 by about 7 percent on the CIFAR-100 dataset. Code is available at https://github.com/fxmeng/filter-grafting.git.
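To make the ideas in the abstract concrete, below is a minimal PyTorch sketch of how invalid filters might be flagged by their l1 norm, how per-filter information might be approximated with an entropy measure, and how weights from a second network could be grafted into a layer. The function names (`invalid_filter_mask`, `filter_entropy`, `graft_filters`), the threshold, the histogram binning, and the fixed coefficient `alpha` are illustrative assumptions, not the paper's exact criterion or adaptive weighting scheme; see the linked repository for the authors' implementation.

```python
import torch
import torch.nn as nn


def invalid_filter_mask(weight, threshold=1e-2):
    """Flag 'invalid' filters whose l1 norm is close to 0.
    `threshold` is an illustrative value, not taken from the paper."""
    return weight.view(weight.size(0), -1).abs().sum(dim=1) < threshold


def filter_entropy(weight, bins=10):
    """Approximate each filter's information by the entropy of a histogram
    over its weight values (a rough stand-in for the paper's criterion)."""
    flat = weight.view(weight.size(0), -1)
    entropies = []
    for f in flat:
        hist = torch.histc(f, bins=bins)        # value histogram of one filter
        p = hist / hist.sum()                   # normalize to a distribution
        p = p[p > 0]                            # drop empty bins before log
        entropies.append(-(p * p.log()).sum())
    return torch.stack(entropies)


def graft_filters(layer_a, layer_b, alpha=0.8):
    """Graft information from network B's layer into network A's layer as a
    weighted combination of their weights. Here `alpha` is fixed; the paper
    describes an adaptive weighting strategy instead."""
    with torch.no_grad():
        layer_a.weight.mul_(alpha).add_((1 - alpha) * layer_b.weight)


# Usage sketch: two parallel layers, one donating information to the other.
conv_a = nn.Conv2d(3, 16, kernel_size=3)
conv_b = nn.Conv2d(3, 16, kernel_size=3)
print(invalid_filter_mask(conv_a.weight))   # which filters look inactive
print(filter_entropy(conv_a.weight))        # per-filter information estimate
graft_filters(conv_a, conv_b)               # conv_a now carries grafted weights
```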