Paper Title

Black-box Adversarial Sample Generation Based on Differential Evolution

Authors

Junyu Lin, Lei Xu, Yingqi Liu, Xiangyu Zhang

Abstract

Deep Neural Networks (DNNs) are being used in various daily tasks such as object detection, speech processing, and machine translation. However, it is known that DNNs suffer from robustness problems -- perturbed inputs called adversarial samples leading to misbehaviors of DNNs. In this paper, we propose a black-box technique called Black-box Momentum Iterative Fast Gradient Sign Method (BMI-FGSM) to test the robustness of DNN models. The technique does not require any knowledge of the structure or weights of the target DNN. Compared to existing white-box testing techniques that require accessing model internal information such as gradients, our technique approximates gradients through Differential Evolution and uses approximated gradients to construct adversarial samples. Experimental results show that our technique can achieve 100% success in generating adversarial samples to trigger misclassification, and over 95% success in generating samples to trigger misclassification to a specific target output label. It also demonstrates better perturbation distance and better transferability. Compared to the state-of-the-art black-box technique, our technique is more efficient. Furthermore, we conduct testing on the commercial Aliyun API and successfully trigger its misbehavior within a limited number of queries, demonstrating the feasibility of real-world black-box attacks.
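
To make the general idea in the abstract concrete, below is a minimal, hypothetical Python sketch: the model is treated as a black box that is only queried for output probabilities, differential evolution searches over candidate gradient-sign maps that increase the classification loss, and the best sign map drives momentum-iterative FGSM-style updates. The toy linear classifier, all function names, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a DE-approximated, momentum-iterative FGSM-style black-box attack.
# The toy model, names, and hyperparameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# Toy black-box classifier: 10 classes, 32-dim input. Only queried, never inspected.
W = rng.normal(size=(10, 32))

def query_model(x):
    logits = W @ x
    e = np.exp(logits - logits.max())
    return e / e.sum()

def loss(x, true_label):
    # Cross-entropy of the true label; a higher loss means closer to misclassification.
    return -np.log(query_model(x)[true_label] + 1e-12)

def de_approx_sign(x, true_label, pop_size=20, gens=10, step=0.01, cr=0.7, f=0.8):
    """Approximate sign(grad_x loss) with differential evolution over sign vectors."""
    d = x.size
    pop = rng.choice([-1.0, 1.0], size=(pop_size, d))            # candidate sign maps
    fitness = np.array([loss(x + step * s, true_label) for s in pop])
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.sign(a + f * (b - c))                     # keep entries in {-1, +1}
            mutant[mutant == 0] = 1.0
            cross = rng.random(d) < cr
            trial = np.where(cross, mutant, pop[i])
            ft = loss(x + step * trial, true_label)
            if ft > fitness[i]:                                   # maximize the loss
                pop[i], fitness[i] = trial, ft
    return pop[np.argmax(fitness)]

def bmi_fgsm_like_attack(x, true_label, eps=0.3, iters=10, mu=0.9):
    """Momentum-iterative updates driven by the DE-approximated gradient sign."""
    alpha = eps / iters
    g, x_adv = np.zeros_like(x), x.copy()
    for _ in range(iters):
        g = mu * g + de_approx_sign(x_adv, true_label)            # accumulate momentum
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
        if query_model(x_adv).argmax() != true_label:
            break
    return x_adv

x0 = rng.normal(size=32)
y0 = int(query_model(x0).argmax())
x_adv = bmi_fgsm_like_attack(x0, y0)
print("original label:", y0, "adversarial label:", int(query_model(x_adv).argmax()))
```

Evolving binary sign vectors rather than raw gradients keeps the search space small and matches the fact that FGSM-style updates only consume the sign of the gradient, which is one plausible reason a query-only approximation can remain efficient.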
