Paper Title

Communication Bounds for Convolutional Neural Networks

Authors

Anthony Chen, James Demmel, Grace Dinh, Mason Haberle, Olga Holtz

Abstract

Convolutional neural networks (CNNs) are important in a wide variety of machine learning tasks and applications, so optimizing their performance is essential. Moving words of data between levels of a memory hierarchy or between processors on a network is much more expensive than arithmetic, so minimizing communication is critical to optimizing performance. In this paper, we present new lower bounds on data movement for mixed precision convolutions in both single-processor and parallel distributed memory models, as well as algorithms that outperform current implementations such as Im2Col. We obtain performance figures using Gemmini, a machine learning accelerator, where our tiling provides improvements between 13% and 150% over a vendor-supplied algorithm.
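For context, Im2Col lowers a convolution to a single matrix multiply by unrolling every sliding window of the input into a column of a new matrix. Below is a minimal single-channel NumPy sketch of the idea; the function name, shapes, and loop structure are illustrative assumptions, not the paper's implementation or Gemmini's.

```python
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Unroll each kh-by-kw sliding window of a 2-D input into one
    column, so convolution becomes a single matrix multiply.
    (Illustrative sketch, not the paper's code.)"""
    h, w = x.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = np.empty((kh * kw, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            cols[:, i * out_w + j] = patch.ravel()  # one window -> one column
    return cols

# Convolution as a matmul: flattened filter times the unrolled windows.
x = np.arange(16, dtype=np.float32).reshape(4, 4)
k = np.ones((3, 3), dtype=np.float32)
out = (k.ravel() @ im2col(x, 3, 3)).reshape(2, 2)  # 2x2 output map
```

Note that the unrolled matrix holds up to kh * kw copies of each interior input element, so the lowering itself inflates the number of words moved, which is presumably one reason a direct, well-tiled convolution can outperform it.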
