Title
Robust error bounds for quantised and pruned neural networks
Authors
Abstract
With the rise of smartphones and the internet-of-things, data is increasingly being generated at the edge on local, personal devices. For privacy, latency, and energy-saving reasons, this shift is pushing machine learning towards decentralisation, with data and algorithms stored, and even trained, locally on devices. Device hardware becomes the main bottleneck for model capability in this setting, creating a need for slimmed-down, more efficient neural networks. Neural network pruning and quantisation are two methods developed for this purpose, and both have demonstrated impressive reductions in computational cost without significantly sacrificing model performance. However, the understanding behind these reduction methods remains underdeveloped. To address this issue, a semi-definite program is introduced to bound the worst-case error caused by pruning or quantising a neural network. The method can be applied to many neural network structures and nonlinear activation functions, with the bounds holding robustly for all inputs in specified sets. It is hoped that the computed bounds will provide certainty about the performance of these algorithms when they are deployed on safety-critical systems.
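As an illustrative sketch of the quantity the abstract describes (the notation below is assumed for exposition and is not taken from the paper): let $f$ denote the original network, $\hat{f}$ its pruned or quantised counterpart, and $\mathcal{X}$ the specified input set. The worst-case error is

$e^{\star} \;=\; \max_{x \in \mathcal{X}} \bigl\lVert f(x) - \hat{f}(x) \bigr\rVert_2,$

which is non-convex to compute exactly for networks with nonlinear activations. Semi-definite relaxations of this kind typically abstract the activation functions by quadratic constraints and collect them into a linear matrix inequality, yielding a convex program whose optimal value $\bar{e} \ge e^{\star}$ is a certified upper bound valid for every $x \in \mathcal{X}$.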