Paper Title

Adversarial Parameter Attack on Deep Neural Networks

Authors

Lijia Yu, Yihan Wang, Xiao-Shan Gao

Abstract


In this paper, a new parameter perturbation attack on DNNs, called the adversarial parameter attack, is proposed, in which small perturbations are made to the parameters of the DNN such that the accuracy of the attacked DNN does not decrease much, but its robustness becomes much lower. The adversarial parameter attack is stronger than previous parameter perturbation attacks in that it is more difficult for users to recognize, and the attacked DNN gives a wrong label for any modified sample input with high probability. The existence of adversarial parameters is proved. For a DNN $F_\Theta$ with parameter set $\Theta$ satisfying certain conditions, it is shown that if the depth of the DNN is sufficiently large, then there exists an adversarial parameter set $\Theta_a$ for $\Theta$ such that the accuracy of $F_{\Theta_a}$ is equal to that of $F_\Theta$, but the robustness measure of $F_{\Theta_a}$ is smaller than any given bound. An effective training algorithm is given to compute adversarial parameters, and numerical experiments demonstrate that the algorithm is effective in producing high-quality adversarial parameters.
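The core idea can be illustrated on a toy model. The sketch below (not the paper's actual algorithm; the dataset, model, and robustness measure are all illustrative assumptions) shows a linear classifier whose weights receive a bounded relative perturbation that leaves clean accuracy unchanged while shrinking the margin, i.e., the robustness:

```python
import numpy as np

# Hedged toy sketch of the adversarial parameter attack idea:
# a relative perturbation of the weights (each weight moves by at most
# eps * |theta_i|) that preserves clean accuracy but shrinks the
# robustness margin of f_theta(x) = sign(x . theta).

# Six linearly separable points, labels +/-1 (illustrative data).
X = np.array([[ 2.0,  2.0], [ 2.4,  1.6], [ 1.6,  2.4],
              [-2.0, -2.0], [-2.4, -1.6], [-1.6, -2.4]])
y = np.array([1, 1, 1, -1, -1, -1])

theta = np.array([1.0, 1.0])  # clean parameters

def accuracy(t):
    return np.mean(np.sign(X @ t) == y)

def robustness(t):
    # Smallest distance of any sample to the decision boundary;
    # a simple stand-in for the paper's robustness measure.
    return np.min(np.abs(X @ t)) / np.linalg.norm(t)

# Adversarial parameters: rotate the boundary toward the data within
# the relative perturbation budget eps.
eps = 0.9
theta_a = theta + eps * np.abs(theta) * np.array([1.0, -1.0])

print(accuracy(theta), accuracy(theta_a))      # accuracy preserved
print(robustness(theta), robustness(theta_a))  # margin shrinks
```

Here every sample stays on the correct side of the perturbed boundary (accuracy is unchanged), yet the closest sample ends up much nearer to it, so a much smaller input perturbation now suffices to flip its label.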
