Paper Title
Understanding Zero-Shot Adversarial Robustness for Large-Scale Models
Paper Authors
Paper Abstract
Pretrained large-scale vision-language models like CLIP have exhibited strong generalization to unseen tasks. Yet imperceptible adversarial perturbations can significantly reduce CLIP's performance on new tasks. In this work, we identify and explore the problem of \emph{adapting large-scale models for zero-shot adversarial robustness}. We first identify two key factors during model adaptation -- training losses and adaptation methods -- that affect the model's zero-shot adversarial robustness. We then propose a text-guided contrastive adversarial training loss, which aligns the text embeddings and the adversarial visual features with contrastive learning on a small set of training data. We apply this training loss to two adaptation methods, model finetuning and visual prompt tuning. We find that visual prompt tuning is more effective in the absence of text guidance, while finetuning wins in the presence of text guidance. Overall, our approach significantly improves zero-shot adversarial robustness over CLIP, with an average improvement of over 31 points across ImageNet and 15 zero-shot datasets. We hope this work sheds light on the zero-shot adversarial robustness of large-scale models.
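To make the proposed loss concrete, below is a minimal PyTorch sketch of one text-guided contrastive adversarial training step in the spirit of the abstract: an inner PGD step perturbs images to push their visual features away from the matching text embeddings, and an outer step re-aligns the adversarial features with those (frozen) text embeddings via a contrastive loss. This is an illustration under simplifying assumptions, not the paper's released implementation; the helper names (`contrastive_loss`, `pgd_attack`, `train_step`), the PGD hyperparameters (`eps`, `alpha`, `steps`), and the toy linear module standing in for CLIP's vision tower are all our own.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(img_feat, txt_feat, temperature=0.07):
    # Image-to-text cross-entropy over cosine similarities: the i-th image
    # in the batch should score highest against the i-th text embedding.
    img_feat = F.normalize(img_feat, dim=-1)
    txt_feat = F.normalize(txt_feat, dim=-1)
    logits = img_feat @ txt_feat.t() / temperature
    labels = torch.arange(len(img_feat), device=img_feat.device)
    return F.cross_entropy(logits, labels)


def pgd_attack(image_encoder, images, txt_feat, eps=4 / 255, alpha=1 / 255, steps=10):
    # Inner maximization: find an L-infinity-bounded perturbation that pushes
    # the adversarial visual features away from their text embeddings.
    delta = torch.empty_like(images).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = contrastive_loss(image_encoder(images + delta), txt_feat)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (images + delta).detach()


def train_step(image_encoder, text_features, images, optimizer):
    # Outer minimization: re-align adversarial visual features with the
    # frozen text embeddings using the same contrastive loss.
    text_features = text_features.detach()
    adv_images = pgd_attack(image_encoder, images, text_features)
    loss = contrastive_loss(image_encoder(adv_images), text_features)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy end-to-end check: a linear "encoder" stands in for CLIP's vision tower,
# random vectors stand in for frozen prompt-text embeddings.
if __name__ == "__main__":
    enc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 64))
    opt = torch.optim.SGD(enc.parameters(), lr=1e-3)
    images = torch.rand(8, 3, 32, 32)
    text_features = torch.randn(8, 64)
    print(train_step(enc, text_features, images, opt))
```

In the finetuning variant described in the abstract, the optimizer would update the encoder weights as above; in the visual prompt tuning variant, one would instead freeze the encoder and optimize only a learnable prompt added to the input images.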