Paper Title

Memory-efficient training with streaming dimensionality reduction

Paper Authors

Siyuan Huang, Brian D. Hoskins, Matthew W. Daniels, Mark D. Stiles, and Gina C. Adam

Paper Abstract

The movement of large quantities of data during the training of a deep neural network presents immense challenges for machine learning workloads. To minimize this overhead, especially the movement and calculation of gradient information, we introduce streaming batch principal component analysis as an update algorithm. Streaming batch principal component analysis uses stochastic power iterations to generate a stochastic rank-k approximation of the network gradient. We demonstrate that the low-rank updates produced by streaming batch principal component analysis can effectively train convolutional neural networks on a variety of common datasets, with performance comparable to standard mini-batch gradient descent. These results can lead to improvements both in the design of application-specific integrated circuits for deep learning and in the synchronization speed of machine learning models trained with data parallelism.
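
The abstract's core mechanism is a stochastic power iteration that maintains a rank-k estimate of the gradient's dominant subspace as mini-batch gradients stream by. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the function name streaming_batch_pca, the grad_stream iterable, and the one-power-step-per-batch schedule are illustrative assumptions.

import itertools
import numpy as np

def streaming_batch_pca(grad_stream, k, rng=None):
    # Hypothetical sketch: maintain an orthonormal basis Q for the
    # top-k subspace of streamed gradient matrices via stochastic
    # power iteration (one multiply-and-reorthonormalize per batch).
    rng = np.random.default_rng() if rng is None else rng
    it = iter(grad_stream)
    first = next(it)                 # peek one batch to size the basis
    m, _ = first.shape
    Q, _ = np.linalg.qr(rng.standard_normal((m, k)))  # random initial basis
    for G in itertools.chain([first], it):
        # Multiply by the per-batch outer-product estimate G G^T,
        # then re-orthonormalize with a QR factorization.
        Q, _ = np.linalg.qr(G @ (G.T @ Q))
    return Q  # columns span the estimated top-k gradient subspace

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fake stream of 8 gradient matrices for a 64x32 weight layer.
    stream = (rng.standard_normal((64, 32)) for _ in range(8))
    Q = streaming_batch_pca(stream, k=4, rng=rng)
    G = rng.standard_normal((64, 32))
    G_low_rank = Q @ (Q.T @ G)  # rank-4 approximation of the update
    print(G_low_rank.shape)     # (64, 32), with rank at most 4

In a data-parallel setting, workers would then only need to exchange the k basis vectors and the k-dimensional projection coefficients instead of the full gradient matrix, which is one way the synchronization speedup mentioned in the abstract could arise.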
