Early Transferability of Adversarial Examples in Deep Neural Networks

Paper Title

Early Transferability of Adversarial Examples in Deep Neural Networks

Authors

BenShmuel, Oriel

Abstract

This paper will describe and analyze a new phenomenon that was not known before, which we call "Early Transferability". Its essence is that the adversarial perturbations transfer among different networks even at extremely early stages in their training. In fact, one can initialize two networks with two different independent choices of random weights and measure the angle between their adversarial perturbations after each step of the training. What we discovered was that these two adversarial directions started to align with each other already after the first few training steps (which typically use only a small fraction of the available training data), even though the accuracy of the two networks hadn't started to improve from their initial bad values due to the early stage of the training. The purpose of this paper is to present this phenomenon experimentally and propose plausible explanations for some of its properties.
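
The measurement described in the abstract can be illustrated with a small sketch. It is not the authors' code: the tiny tanh network, the synthetic data, the single probe input, and the use of the loss gradient with respect to the input as the "adversarial direction" are all assumptions made for illustration, and every function name below is hypothetical. The sketch initializes two networks from independent random seeds, trains both on the same stream of examples, and records the angle between their adversarial directions before each training step.

```python
import math
import random

def init_net(seed, n_in=8, n_hidden=16):
    # Independent random weights for a one-hidden-layer network (assumed architecture).
    rng = random.Random(seed)
    W1 = [[rng.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_in)]
          for _ in range(n_hidden)]
    w2 = [rng.gauss(0, 1 / math.sqrt(n_hidden)) for _ in range(n_hidden)]
    return W1, w2

def forward(net, x):
    W1, w2 = net
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    p = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w2, h))))
    return h, p

def input_gradient(net, x, y):
    # Gradient of the binary cross-entropy loss w.r.t. the input x.
    # Used here as a stand-in for the adversarial perturbation direction.
    W1, w2 = net
    h, p = forward(net, x)
    delta = [(p - y) * w2j * (1 - hj * hj) for w2j, hj in zip(w2, h)]
    return [sum(delta[j] * W1[j][i] for j in range(len(W1)))
            for i in range(len(x))]

def sgd_step(net, x, y, lr=0.1):
    # One plain SGD update on a single example (updates the weights in place).
    W1, w2 = net
    h, p = forward(net, x)
    err = p - y
    delta = [err * w2j * (1 - hj * hj) for w2j, hj in zip(w2, h)]
    for j in range(len(w2)):
        w2[j] -= lr * err * h[j]
        for i in range(len(x)):
            W1[j][i] -= lr * delta[j] * x[i]

def angle_deg(u, v):
    # Angle between two vectors in degrees; the cosine is clamped for rounding safety.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos))

def label(x):
    # Synthetic labeling rule (assumption): class 1 if x[0] + x[1] > 0.
    return 1.0 if x[0] + x[1] > 0 else 0.0

def track_alignment(steps=50, seed_a=1, seed_b=2, n_in=8):
    data_rng = random.Random(0)
    net_a, net_b = init_net(seed_a, n_in), init_net(seed_b, n_in)
    probe = [data_rng.gauss(0, 1) for _ in range(n_in)]
    angles = []
    for _ in range(steps):
        g_a = input_gradient(net_a, probe, label(probe))
        g_b = input_gradient(net_b, probe, label(probe))
        angles.append(angle_deg(g_a, g_b))
        x = [data_rng.gauss(0, 1) for _ in range(n_in)]
        sgd_step(net_a, x, label(x))
        sgd_step(net_b, x, label(x))
    return angles

if __name__ == "__main__":
    for step, angle in enumerate(track_alignment(steps=20)):
        print(f"step {step:2d}: angle = {angle:6.2f} degrees")
```

An angle near 90 degrees indicates unrelated directions, while smaller angles indicate alignment. A toy setup like this is not expected to reproduce the paper's results; it only shows how the quantity could be tracked step by step. A closer replication would presumably average the angle over many inputs and use the paper's own attack and architectures.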
