On the Role of Generalization in Transferability of Adversarial Examples

Paper Title

On the Role of Generalization in Transferability of Adversarial Examples

Authors

Yilin Wang, Farzan Farnia

Abstract

Black-box adversarial attacks designing adversarial examples for unseen neural networks (NNs) have received great attention over the past years. While several successful black-box attack schemes have been proposed in the literature, the underlying factors driving the transferability of black-box adversarial examples still lack a thorough understanding. In this paper, we aim to demonstrate the role of the generalization properties of the substitute classifier used for generating adversarial examples in the transferability of the attack scheme to unobserved NN classifiers. To do this, we apply the max-min adversarial example game framework and show the importance of the generalization properties of the substitute NN in the success of the black-box attack scheme in application to different NN classifiers. We prove theoretical generalization bounds on the difference between the attack transferability rates on training and test samples. Our bounds suggest that a substitute NN with better generalization behavior could result in more transferable adversarial examples. In addition, we show that standard operator norm-based regularization methods could improve the transferability of the designed adversarial examples. We support our theoretical results by performing several numerical experiments showing the role of the substitute network's generalization in generating transferable adversarial examples. Our empirical results indicate the power of Lipschitz regularization methods in improving the transferability of adversarial examples.
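
The abstract does not say which operator-norm regularizer the authors use, so the following is only an illustrative sketch. It assumes the operator norm in question is the spectral norm (largest singular value) of a layer's weight matrix, estimates it by power iteration, and rescales the weights so the layer's Lipschitz constant stays below a chosen cap. The function names `spectral_norm` and `cap_spectral_norm`, the iteration count, and the cap value are hypothetical choices made for this example, not taken from the paper.

```python
import numpy as np


def spectral_norm(W, n_iters=100, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        u = W @ v
        u_norm = np.linalg.norm(u)
        if u_norm == 0.0:
            # W maps v to zero (e.g. W is the zero matrix): report 0.
            return 0.0
        u /= u_norm
        v = W.T @ u
        v /= np.linalg.norm(v)
    # u^T W v approximates the top singular value after convergence.
    return float(u @ W @ v)


def cap_spectral_norm(W, max_norm=1.0):
    """Rescale W so its estimated spectral norm is at most max_norm.

    Assumption: a hard rescaling stands in for whatever regularizer
    the paper actually applies during training.
    """
    sigma = spectral_norm(W)
    if sigma <= max_norm:
        return W
    return W * (max_norm / sigma)
```

For a linear layer followed by a 1-Lipschitz activation such as ReLU, capping the spectral norm of each weight matrix bounds the Lipschitz constant of the whole network by the product of the per-layer caps, which is one common way to realize Lipschitz regularization in practice.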
