Paper Title
Efficient Adversarial Training With Data Pruning
Paper Authors
Paper Abstract
Neural networks are susceptible to adversarial examples: small input perturbations that cause models to fail. Adversarial training is one of the solutions to adversarial examples; models are exposed to attacks during training and learn to be resilient to them. Yet such a procedure is currently expensive: it takes a long time to produce and train models on adversarial samples and, worse, it occasionally fails. In this paper we demonstrate data pruning, a method for increasing adversarial training efficiency through data sub-sampling. We empirically show that data pruning improves the convergence and reliability of adversarial training, albeit with varying levels of utility degradation. For example, we observe that using random sub-sampling of CIFAR10 to drop 40% of the data, we lose 8% adversarial accuracy against the strongest attackers, while using only 20% of the data we lose 14% adversarial accuracy and reduce runtime by a factor of 3. Interestingly, we discover that in some settings data pruning offers the best of both worlds: it improves both adversarial accuracy and training time.
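To make the abstract's idea concrete, below is a minimal sketch of random data pruning combined with PGD-based adversarial training in PyTorch. The model architecture, the keep fraction, and the PGD hyperparameters (`eps`, `alpha`, `steps`) are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: random data pruning + adversarial training on CIFAR10 (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard PGD: iterated signed-gradient steps projected into an L-inf ball."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

# Random sub-sampling: keep 60% of the training set, i.e. "drop 40% of the
# data" as in the abstract. The fraction is a tunable assumption.
keep_frac = 0.6
train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=transforms.ToTensor())
n_keep = int(keep_frac * len(train_set))
idx = torch.randperm(len(train_set))[:n_keep]
loader = DataLoader(Subset(train_set, idx.tolist()), batch_size=128, shuffle=True)

# A tiny CNN stands in for whatever classifier is actually trained.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Adversarial training (one epoch shown): attack each pruned batch,
# then take a gradient step on the perturbed examples.
for x, y in loader:
    x_adv = pgd_attack(model, x, y)
    opt.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    opt.step()
```

Because the PGD inner loop dominates the cost of each step, training on a pruned subset shrinks runtime roughly in proportion to the data dropped, which is the efficiency effect the abstract reports.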