Paper Title

Comparing Rewinding and Fine-tuning in Neural Network Pruning

Paper Authors

Alex Renda, Jonathan Frankle, Michael Carbin

Paper Abstract

Many neural network pruning algorithms proceed in three steps: train the network to completion, remove unwanted structure to compress the network, and retrain the remaining structure to recover lost accuracy. The standard retraining technique, fine-tuning, trains the unpruned weights from their final trained values using a small fixed learning rate. In this paper, we compare fine-tuning to alternative retraining techniques. Weight rewinding (as proposed by Frankle et al. (2019)) rewinds unpruned weights to their values from earlier in training and retrains them from there using the original training schedule. Learning rate rewinding (which we propose) trains the unpruned weights from their final values using the same learning rate schedule as weight rewinding. Both rewinding techniques outperform fine-tuning, forming the basis of a network-agnostic pruning algorithm that matches the accuracy and compression ratios of several more network-specific state-of-the-art techniques.
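To make the distinction between the three retraining schemes concrete, below is a minimal sketch (not the authors' released code) on a toy PyTorch model. The helpers `magnitude_prune` and `train`, the random regression data, and the learning-rate schedule are illustrative assumptions; in the paper, rewinding targets weights from an earlier training epoch rather than initialization, and pruning may be applied iteratively.

```python
# Minimal sketch of fine-tuning, weight rewinding, and learning rate rewinding
# after one-shot magnitude pruning. All helper names and hyperparameters below
# are illustrative assumptions, not details taken from the paper.

import copy
import torch
import torch.nn as nn

def magnitude_prune(model, fraction):
    """Zero out the smallest-magnitude weights globally and return binary masks."""
    all_weights = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    threshold = torch.quantile(all_weights, fraction)
    masks = []
    with torch.no_grad():
        for p in model.parameters():
            mask = (p.abs() > threshold).float()
            p.mul_(mask)  # remove the pruned weights
            masks.append(mask)
    return masks

def train(model, masks, lr_schedule, steps_per_lr=100):
    """Toy training loop on random data; masks keep pruned weights at zero."""
    x, y = torch.randn(256, 20), torch.randn(256, 1)
    for lr in lr_schedule:
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(steps_per_lr):
            opt.zero_grad()
            nn.functional.mse_loss(model(x), y).backward()
            opt.step()
            if masks is not None:
                with torch.no_grad():
                    for p, m in zip(model.parameters(), masks):
                        p.mul_(m)

model = nn.Linear(20, 1)
rewind_state = copy.deepcopy(model.state_dict())  # weights from "earlier in training" (here: initialization)
original_schedule = [0.1, 0.01, 0.001]            # original decaying learning-rate schedule

train(model, None, original_schedule)             # step 1: train to completion
masks = magnitude_prune(model, fraction=0.8)      # step 2: remove 80% of weights
pruned_state = copy.deepcopy(model.state_dict())

# (a) Fine-tuning: keep the final trained weights, retrain at the small final learning rate.
model.load_state_dict(pruned_state)
train(model, masks, [original_schedule[-1]])

# (b) Weight rewinding: rewind unpruned weights to their earlier values, reuse the original schedule.
model.load_state_dict(rewind_state)
with torch.no_grad():
    for p, m in zip(model.parameters(), masks):
        p.mul_(m)
train(model, masks, original_schedule)

# (c) Learning rate rewinding: keep the final trained weights, but reuse the original schedule.
model.load_state_dict(pruned_state)
train(model, masks, original_schedule)
```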
