Paper Title
Robust Fine-tuning via Perturbation and Interpolation from In-batch Instances
Paper Authors
Paper Abstract
Fine-tuning pretrained language models (PLMs) on downstream tasks has become common practice in natural language processing. However, most PLMs are vulnerable: for example, they are brittle under adversarial attacks or imbalanced data, which hinders their application to some downstream tasks, especially in safety-critical scenarios. In this paper, we propose a simple yet effective fine-tuning method called Match-Tuning that forces PLMs to be more robust. For each instance in a batch, we involve the other instances in the same batch to interact with it. Specifically, by regarding instances with other labels as perturbations, Match-Tuning makes the model more robust to noise at the beginning of training. Toward the end of training, Match-Tuning focuses more on interpolating among instances with the same label for better generalization. Extensive experiments on various tasks in the GLUE benchmark show that Match-Tuning consistently outperforms vanilla fine-tuning by $1.64$ points. Moreover, Match-Tuning exhibits remarkable robustness to adversarial attacks and data imbalance.
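To make the in-batch interaction concrete, below is a minimal PyTorch sketch of the idea the abstract describes: each instance's representation is mixed with other instances in the batch, with weight shifting from different-label instances (perturbation) early in training to same-label instances (interpolation) later. The function name `match_tuning_mix`, the linear schedule, and the exact weighting are illustrative assumptions, not the paper's actual formulation.

```python
# A hypothetical sketch of Match-Tuning's in-batch mixing; details are assumed.
import torch

def match_tuning_mix(hidden, labels, progress, alpha=0.9):
    """Interpolate each instance's representation with other in-batch instances.

    hidden:   (B, D) sentence representations from the PLM encoder.
    labels:   (B,)   gold labels of the batch.
    progress: float in [0, 1], fraction of training completed.
    alpha:    weight kept on the instance's own representation (assumed).
    """
    # Pairwise label-agreement matrix, (B, B).
    same = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    eye = torch.eye(len(labels), device=hidden.device)
    same_others = same - eye      # same-label instances, excluding self
    diff_others = 1.0 - same      # different-label instances
    # Shift mass from different-label neighbors (noise/perturbation) toward
    # same-label neighbors (interpolation) as training progresses; this
    # linear schedule is an assumption for illustration.
    weights = (1 - progress) * diff_others + progress * same_others
    weights = weights / weights.sum(dim=1, keepdim=True).clamp(min=1e-8)
    # Mix each representation with its weighted in-batch neighbors.
    return alpha * hidden + (1 - alpha) * weights @ hidden
```

A trainer would call this on the encoder outputs before the classification head, passing the current step divided by the total steps as `progress`.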