Paper Title

Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning

Authors

Zhang, Yinghua, Song, Yangqiu, Liang, Jian, Bai, Kun, Yang, Qiang

Abstract

Transfer learning has become a common practice for training deep learning models with limited labeled data in a target domain. On the other hand, deep models are vulnerable to adversarial attacks. Though transfer learning has been widely applied, its effect on model robustness is unclear. To investigate this question, we conduct extensive empirical evaluations to show that fine-tuning effectively enhances model robustness under white-box FGSM attacks. We also propose a black-box attack method for transfer learning models, which attacks the target model with the adversarial examples produced by its source model. To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable the adversarial examples produced by a source model are to a target model. Empirical results show that the adversarial examples are more transferable when fine-tuning is used than when the two networks are trained independently.
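The white-box FGSM attack evaluated in the abstract perturbs an input one step along the sign of the loss gradient with respect to that input. A minimal sketch, using a toy logistic-regression model with an analytic gradient rather than the paper's deep networks (all weights and inputs below are hypothetical illustration values):

```python
import numpy as np

def bce_loss(x, y, w, b):
    """Binary cross-entropy loss of a logistic-regression model on input x."""
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm_attack(x, y, w, b, eps=0.25):
    """One-step FGSM: x_adv = x + eps * sign(dL/dx).

    For logistic regression with BCE loss, the input gradient is
    (p - y) * w, so no autodiff framework is needed here.
    """
    p = 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Hypothetical "source model" weights and a correctly classified input.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.5, -0.5, 1.0])   # logit = 2.0, so the model predicts y = 1
y = 1.0

x_adv = fgsm_attack(x, y, w, b)
# The perturbation should increase the loss of the model that produced it.
print(bce_loss(x, y, w, b), bce_loss(x_adv, y, w, b))
```

In the paper's black-box setting, such adversarial examples would be generated on the source model and then fed to the fine-tuned target model; the proposed metric quantifies how well they transfer.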
