Paper Title
FTT-NAS: Discovering Fault-Tolerant Convolutional Neural Architecture
Paper Authors
Abstract
With the fast evolution of embedded deep-learning computing systems, applications powered by deep learning are moving from the cloud to the edge. When deploying neural networks (NNs) onto devices operating in complex environments, various types of faults can occur: soft errors caused by cosmic radiation and radioactive impurities, voltage instability, aging, temperature variations, and malicious attacks. Thus, the safety risks of deploying NNs are now drawing much attention. In this paper, after analyzing the possible faults in various types of NN accelerators, we formalize and implement various fault models from the algorithmic perspective. We propose Fault-Tolerant Neural Architecture Search (FT-NAS) to automatically discover convolutional neural network (CNN) architectures that are resilient to the faults common in today's devices. We then incorporate fault-tolerant training (FTT) into the search process to achieve better results, an approach referred to as FTT-NAS. Experiments on CIFAR-10 show that the discovered architectures significantly outperform manually designed baseline architectures, with comparable or fewer floating-point operations (FLOPs) and parameters. Specifically, under the same fault settings, F-FTT-Net, discovered under the feature fault model, achieves an accuracy of 86.2% (vs. 68.1% achieved by MobileNet-V2), and W-FTT-Net, discovered under the weight fault model, achieves an accuracy of 69.6% (vs. 60.8% achieved by ResNet-20). By inspecting the discovered architectures, we find that the operation primitives, the weight quantization range, the model capacity, and the connection pattern all influence the fault resilience of NN models.
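The weight fault model mentioned in the abstract can be illustrated with a minimal sketch: randomly selected 8-bit quantized weights suffer a single bit flip before being used in a forward pass. This is a simplified, hypothetical illustration (the naive symmetric quantization, the per-weight Bernoulli fault rate, and the function name are assumptions, not the paper's exact formulation); in fault-tolerant training, such faults would be injected during the forward pass so the gradients adapt the model to them.

```python
import numpy as np

def inject_weight_faults(weights, fault_rate, rng):
    """Simulate a simple weight fault model: quantize weights to 8 bits,
    then flip one random bit in each weight selected with probability
    `fault_rate`. Illustrative sketch only, not the paper's implementation."""
    # Naive symmetric 8-bit quantization (assumed for illustration).
    q = np.clip(np.round(weights * 127), -128, 127).astype(np.int8)
    # Bernoulli mask selecting which weights are faulty.
    mask = rng.random(q.shape) < fault_rate
    # Choose one of the 8 bit positions to flip per weight.
    bits = rng.integers(0, 8, size=q.shape).astype(np.uint8)
    flipped = (q.view(np.uint8) ^ (np.uint8(1) << bits)).view(np.int8)
    faulty = np.where(mask, flipped, q)
    # Dequantize back to floating point for the forward pass.
    return faulty.astype(np.float32) / 127.0
```

With `fault_rate=0` this reduces to plain quantize/dequantize, which makes the fault injection easy to toggle between clean evaluation and fault-tolerant training.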