Paper Title
Is Complexity Required for Neural Network Pruning? A Case Study on Global Magnitude Pruning
Paper Authors
Paper Abstract
Pruning neural networks became popular in the last decade, after it was shown that a large number of weights can be safely removed from modern neural networks without compromising accuracy. Numerous pruning methods have been proposed since, each claiming to improve on prior art, albeit at the cost of increasingly complex pruning methodologies. These methodologies include utilizing importance scores, obtaining feedback through back-propagation, or applying heuristics-based pruning rules, amongst others. In this work, we question whether this pattern of introducing complexity is really necessary to achieve better pruning results. We benchmark these SOTA techniques against a simple pruning baseline, namely Global Magnitude Pruning (Global MP), which ranks weights by their magnitudes and prunes the smallest ones. Surprisingly, we find that vanilla Global MP performs very well against the SOTA techniques. On the sparsity-accuracy trade-off, Global MP performs better than all SOTA techniques at all sparsity ratios. On the FLOPs-accuracy trade-off, some SOTA techniques outperform Global MP at lower sparsity ratios; however, Global MP starts performing well at high sparsity ratios and performs very well at extremely high sparsity ratios. Moreover, we find that a common issue many pruning algorithms run into at high sparsity rates, namely layer-collapse, can be easily fixed in Global MP. We explore why layer-collapse occurs in networks and how it can be mitigated in Global MP by utilizing a technique called Minimum Threshold. We showcase the above findings on various models (WRN-28-8, ResNet-32, ResNet-50, MobileNet-V1 and FastGRNN) and multiple datasets (CIFAR-10, ImageNet and HAR-2). Code is available at https://github.com/manasgupta-1/GlobalMP.
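The abstract already spells out the core procedure: rank all weights globally by magnitude, prune the smallest, and optionally keep a minimum number of weights per layer to prevent layer-collapse. Below is a minimal PyTorch sketch of that idea for illustration only; the function name `global_magnitude_prune` and the `min_keep` parameter are placeholders of mine and are not taken from the paper or the GlobalMP repository.

```python
# Minimal sketch of global magnitude pruning with an optional per-layer
# "minimum threshold" safeguard. Names and parameters are illustrative,
# not the paper's official implementation.

import torch
import torch.nn as nn


def global_magnitude_prune(model: nn.Module, sparsity: float, min_keep: int = 0) -> dict:
    """Zero out the globally smallest-magnitude weights of all Linear/Conv2d layers.

    sparsity : fraction of prunable weights to remove (e.g. 0.9 removes 90%).
    min_keep : minimum number of weights to retain per layer (a rough stand-in
               for the Minimum Threshold idea); 0 disables the safeguard.
    Returns a dict of binary masks keyed by module name.
    """
    prunable = {
        name: m for name, m in model.named_modules()
        if isinstance(m, (nn.Linear, nn.Conv2d))
    }

    # Rank the weights of all prunable layers together by absolute magnitude
    # and find the global cutoff value.
    all_weights = torch.cat([m.weight.detach().abs().flatten() for m in prunable.values()])
    num_prune = int(sparsity * all_weights.numel())
    threshold = torch.kthvalue(all_weights, num_prune).values if num_prune > 0 else torch.tensor(0.0)

    masks = {}
    for name, m in prunable.items():
        w = m.weight.detach().abs()
        mask = (w > threshold).float()

        # Safeguard against layer-collapse: if the global cutoff would leave
        # fewer than `min_keep` weights in this layer, keep its largest ones.
        if min_keep > 0 and mask.sum() < min_keep:
            k = min(min_keep, w.numel())
            topk = torch.topk(w.flatten(), k).indices
            mask = torch.zeros_like(w).flatten()
            mask[topk] = 1.0
            mask = mask.view_as(w)

        m.weight.data.mul_(mask)  # apply the mask in place
        masks[name] = mask
    return masks


if __name__ == "__main__":
    # Toy usage: prune 90% of a small network's weights, keeping >= 10 per layer.
    net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(), nn.Linear(16 * 30 * 30, 10))
    masks = global_magnitude_prune(net, sparsity=0.9, min_keep=10)
    kept = sum(int(m.sum()) for m in masks.values())
    total = sum(m.numel() for m in masks.values())
    print(f"kept {kept}/{total} weights")
```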