Paper Title
Multigrid-in-Channels Architectures for Wide Convolutional Neural Networks
Paper Authors
Paper Abstract
We present a multigrid approach that combats the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). It has been shown that there is a redundancy in standard CNNs, as networks with much sparser convolution operators can yield similar performance to full networks. The sparsity patterns that lead to such behavior, however, are typically random, hampering hardware efficiency. In this work, we present a multigrid-in-channels approach for building CNN architectures that achieves full coupling of the channels, and whose number of parameters is linearly proportional to the width of the network. To this end, we replace each convolution layer in a generic CNN with a multilevel layer consisting of structured (i.e., grouped) convolutions. Our examples from supervised image classification show that applying this strategy to residual networks and MobileNetV2 considerably reduces the number of parameters without negatively affecting accuracy. Therefore, we can widen networks without dramatically increasing the number of parameters or operations.
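The abstract's central claim is the contrast between the quadratic parameter growth of a standard convolution (every output channel connects to every input channel) and the linear growth of grouped convolutions with a fixed group size. The sketch below illustrates that parameter arithmetic only; the function names and the group size of 8 are illustrative choices, not the paper's exact multigrid construction.

```python
# Parameter-count comparison: standard (fully coupled) vs. grouped 3x3
# convolutions. Illustrates the quadratic-vs-linear growth in the number
# of channels described in the abstract; not the paper's exact layer design.

def dense_conv_params(channels: int, k: int = 3) -> int:
    """Standard k x k convolution with full channel coupling: O(channels^2)."""
    return channels * channels * k * k

def grouped_conv_params(channels: int, group_size: int = 8, k: int = 3) -> int:
    """Grouped k x k convolution with a fixed group size: O(channels)."""
    groups = channels // group_size
    # Each group is a dense conv over group_size channels, so the total is
    # groups * group_size^2 * k^2 = channels * group_size * k^2.
    return groups * group_size * group_size * k * k

# Doubling the width quadruples the dense parameter count but only
# doubles the grouped one.
for c in (64, 128, 256):
    print(f"width {c:4d}: dense {dense_conv_params(c):8d}, "
          f"grouped {grouped_conv_params(c):6d}")
```

This is why the abstract can promise to "widen networks without dramatically increasing the number of parameters": with grouped convolutions of fixed group size, the per-layer cost scales with the width rather than its square, and the multilevel structure is what restores the full channel coupling that plain grouping gives up.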