Paper Title
The Effects of Approximate Multiplication on Convolutional Neural Networks
Paper Authors
Paper Abstract
This paper analyzes the effects of approximate multiplication when performing inferences on deep convolutional neural networks (CNNs). Approximate multiplication can reduce the cost of the underlying circuits so that CNN inferences can be performed more efficiently in hardware accelerators. The study identifies the critical factors in the convolution, fully-connected, and batch normalization layers that allow more accurate CNN predictions despite the errors from approximate multiplication. The same factors also provide an arithmetic explanation of why bfloat16 multiplication performs well on CNNs. The experiments are performed with recognized network architectures to show that the approximate multipliers can produce predictions that are nearly as accurate as the FP32 references, without additional training. For example, the ResNet and Inception-v4 models with Mitch-$w$6 multiplication produce Top-5 errors that are within 0.2% of the FP32 references. A brief cost comparison of Mitch-$w$6 against bfloat16 is presented, in which a MAC operation saves up to 80% of energy compared to bfloat16 arithmetic. The most far-reaching contribution of this paper is the analytical justification that multiplications can be approximated while additions need to be exact in CNN MAC operations.
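For intuition, the Python sketch below illustrates Mitchell's logarithmic approximation, the idea underlying log-based approximate multipliers such as Mitch-$w$. It is a minimal software model with hypothetical function names, not the paper's hardware design: log2(1 + f) is replaced by f, so each product is formed with additions in the log domain, while the MAC accumulation itself stays exact.

```python
import math

def mitchell_multiply(a: float, b: float) -> float:
    """Approximate a * b with Mitchell's logarithmic multiplication:
    log2(1 + f) is replaced by f, so the product is formed with
    additions in the log domain instead of a true multiplication.
    (Illustrative sketch only, not the Mitch-w hardware design.)"""
    if a == 0.0 or b == 0.0:
        return 0.0
    sign = math.copysign(1.0, a) * math.copysign(1.0, b)
    a, b = abs(a), abs(b)

    # Decompose each operand as 2**k * (1 + f) with 0 <= f < 1.
    ka, kb = math.floor(math.log2(a)), math.floor(math.log2(b))
    fa, fb = a / 2.0**ka - 1.0, b / 2.0**kb - 1.0

    # Mitchell approximation of the log sum: log2(a*b) ~ ka + kb + fa + fb.
    k, f = ka + kb, fa + fb
    if f >= 1.0:  # carry from the fraction sum into the exponent
        k, f = k + 1, f - 1.0

    # Approximate antilogarithm: 2**(k + f) ~ 2**k * (1 + f).
    return sign * 2.0**k * (1.0 + f)


def approx_mac(weights, activations):
    """Multiply-accumulate with approximate products but exact accumulation,
    mirroring the paper's point that only the multiplications are approximated."""
    return sum(mitchell_multiply(w, x) for w, x in zip(weights, activations))
```

For example, mitchell_multiply(3.0, 5.0) returns 14.0 instead of 15.0; the paper's analysis of the convolution, fully-connected, and batch normalization layers addresses why such per-product errors need not degrade the final predictions.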