Paper Title

How Does Mixup Help With Robustness and Generalization?

Paper Authors

Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou

Paper Abstract

Mixup is a popular data augmentation technique based on taking convex combinations of pairs of examples and their labels. This simple technique has been shown to substantially improve both the robustness and the generalization of the trained model. However, it is not well understood why such improvement occurs. In this paper, we provide theoretical analysis to demonstrate how using Mixup in training helps model robustness and generalization. For robustness, we show that minimizing the Mixup loss corresponds to approximately minimizing an upper bound of the adversarial loss. This explains why models obtained by Mixup training exhibit robustness to several kinds of adversarial attacks, such as the Fast Gradient Sign Method (FGSM). For generalization, we prove that Mixup augmentation corresponds to a specific type of data-adaptive regularization which reduces overfitting. Our analysis provides new insights and a framework for understanding Mixup.
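For concreteness, below is a minimal sketch of the augmentation the abstract describes: a convex combination of a pair of examples and their one-hot labels, with the mixing weight drawn from a Beta distribution as in the original Mixup formulation. The function name `mixup` and the default `alpha` here are illustrative, not taken from this paper.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Convex combination of a pair of examples and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing weight lambda ~ Beta(alpha, alpha)
    x_mix = lam * x1 + (1.0 - lam) * x2   # mixed input
    y_mix = lam * y1 + (1.0 - lam) * y2   # mixed (soft) label
    return x_mix, y_mix

# Usage: mix two toy examples with one-hot labels.
x_a, y_a = np.array([1.0, 0.0, 2.0]), np.array([1.0, 0.0])
x_b, y_b = np.array([0.0, 3.0, 1.0]), np.array([0.0, 1.0])
x_mix, y_mix = mixup(x_a, y_a, x_b, y_b, alpha=0.2)
```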
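The abstract also names FGSM as one of the attacks Mixup-trained models resist. Below is a minimal sketch of that one-step attack under a simple logistic-regression loss; the linear model, `w`, and `eps` are illustrative assumptions, and the paper's robustness analysis is not limited to this setting.

```python
import numpy as np

def fgsm(x, y, w, eps=0.1):
    """One-step FGSM: x_adv = x + eps * sign(grad_x loss), for a logistic loss."""
    # Logistic loss l(x) = log(1 + exp(-y * w.x)) with label y in {-1, +1};
    # its gradient w.r.t. the input x is -y * sigmoid(-y * w.x) * w.
    margin = -y * np.dot(w, x)
    grad_x = -y * (1.0 / (1.0 + np.exp(-margin))) * w
    return x + eps * np.sign(grad_x)

# Usage: perturb a toy input against a fixed linear model.
w = np.array([0.5, -1.0, 2.0])
x, y = np.array([1.0, 0.0, 1.0]), 1.0
x_adv = fgsm(x, y, w, eps=0.05)
```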
