Paper Title
Boundary Adversarial Examples Against Adversarial Overfitting
Paper Authors
Paper Abstract
Standard adversarial training approaches suffer from robust overfitting, where robust accuracy decreases when models are adversarially trained for too long. The origin of this problem is still unclear, and conflicting explanations have been reported: memorization effects induced by large-loss data, effects of small-loss data, or growing differences in the loss distribution of training samples as adversarial training progresses. Consequently, several mitigation approaches, including early stopping, temporal ensembling, and weight perturbations on small-loss data, have been proposed to alleviate robust overfitting. However, a side effect of these strategies is a larger reduction in clean accuracy compared to standard adversarial training. In this paper, we investigate whether these mitigation approaches are complementary to each other in improving adversarial training performance. We further propose the use of helper adversarial examples that can be obtained at minimal cost during adversarial example generation, and show how they increase clean accuracy under the existing approaches without compromising robust accuracy.
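To make the setting concrete, the sketch below illustrates projected gradient descent (PGD) adversarial example generation on a toy logistic-regression model, and how the intermediate attack iterates come for free as extra training examples. This is a minimal NumPy illustration of the general technique, not the paper's actual method; the function names, hyperparameters, and the choice of intermediate iterates as "helper" examples are this sketch's own assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_x(w, b, x, y):
    # Gradient of the binary cross-entropy loss w.r.t. the input x
    # for a logistic-regression model p = sigmoid(w @ x + b).
    p = sigmoid(w @ x + b)
    return (p - y) * w

def pgd_attack(w, b, x, y, eps=0.3, alpha=0.1, steps=10):
    """L-infinity PGD attack on a logistic-regression model.

    Returns the final adversarial example and every intermediate
    iterate; the intermediates are a by-product of the attack and
    cost nothing extra to collect (hypothetical 'helper' examples).
    """
    x_adv = x.copy()
    iterates = []
    for _ in range(steps):
        g = loss_grad_x(w, b, x_adv, y)
        x_adv = x_adv + alpha * np.sign(g)         # ascend the loss
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # project into eps-ball
        iterates.append(x_adv.copy())
    return x_adv, iterates
```

For example, with `w = np.array([1.0, -2.0])`, `b = 0.0`, a clean input `x = np.array([0.5, 0.5])`, and label `y = 1.0`, the returned `x_adv` stays within the epsilon-ball around `x` while incurring a strictly larger loss than the clean input; in adversarial training, the model would then be updated on such perturbed inputs.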