Paper Title

Multiply-and-Fire (MNF): An Event-driven Sparse Neural Network Accelerator

Paper Authors

Miao Yu, Tingting Xiang, Venkata Pavan Kumar Miriyala, Trevor E. Carlson

Abstract

Machine learning, particularly deep neural network inference, has become a vital workload for many computing systems, from data centers and HPC systems to edge-based computing. As advances in sparsity have helped improve the efficiency of AI acceleration, there is a continued need for improved system efficiency for both high-performance and system-level acceleration. This work takes a unique look at sparsity with an event-driven (or activation-driven) approach to ANN acceleration that aims to minimize useless work, improve utilization, and increase performance and energy efficiency. Our analytical and experimental results show that this event-driven solution presents a new direction to enable highly efficient AI inference for both CNN and MLP workloads. This work demonstrates state-of-the-art energy efficiency and performance centering on activation-based sparsity and a highly parallel dataflow method that improves the overall functional unit utilization (at 30 fps). This work enhances energy efficiency over a state-of-the-art solution by 1.46$\times$. Taken together, this methodology presents a novel direction to achieve high-efficiency, high-performance designs for next-generation AI acceleration platforms.
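To make the activation-driven idea concrete, the sketch below shows a hypothetical software analogue of a "multiply-and-fire" layer: each nonzero (ReLU-surviving) activation is treated as an event that fires its weight column into the output accumulators, so zero activations cost no multiplies at all. This is an illustration of the principle only, under assumed names (`event_driven_layer`), not the paper's hardware dataflow.

```python
import numpy as np

def event_driven_layer(weights, activations):
    """Compute a dense layer's output by iterating only over nonzero
    activations: each nonzero input is an 'event' that multiplies its
    weight column and fires the products into the accumulators."""
    out = np.zeros(weights.shape[0])
    for j in np.flatnonzero(activations):   # skip zero activations entirely
        out += activations[j] * weights[:, j]  # one multiply-and-fire event
    return np.maximum(out, 0.0)  # ReLU creates the next layer's sparsity

# With highly sparse activations, most columns are never touched,
# which is the "useless work" the event-driven approach avoids.
W = np.array([[1.0, -2.0, 0.5],
              [0.0,  3.0, -1.0]])
a = np.array([2.0, 0.0, 4.0])   # one zero activation -> one skipped event
print(event_driven_layer(W, a))
```

The result matches the dense computation `relu(W @ a)`; the savings come purely from never reading the weight columns of zero activations.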
