Investigating Why Contrastive Learning Benefits Robustness Against Label Noise

Paper Title

Investigating Why Contrastive Learning Benefits Robustness Against Label Noise

Authors

Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman

Abstract

Self-supervised Contrastive Learning (CL) has recently been shown to be very effective in preventing deep networks from overfitting noisy labels. Despite its empirical success, the theoretical understanding of the effect of contrastive learning on boosting robustness is very limited. In this work, we rigorously prove that the representation matrix learned by contrastive learning boosts robustness, by having: (i) one prominent singular value corresponding to each sub-class in the data, and significantly smaller remaining singular values; and (ii) a large alignment between the prominent singular vectors and the clean labels of each sub-class. The above properties enable a linear layer trained on such representations to effectively learn the clean labels without overfitting the noise. We further show that the low-rank structure of the Jacobian of deep networks pre-trained with contrastive learning allows them to achieve superior performance initially, when fine-tuned on noisy labels. Finally, we demonstrate that the initial robustness provided by contrastive learning enables robust training methods to achieve state-of-the-art performance under extreme noise levels, e.g., an average of 27.18% and 15.58% increase in accuracy on CIFAR-10 and CIFAR-100 with 80% symmetric noisy labels, and 4.11% increase in accuracy on WebVision.
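
The two properties named in the abstract can be illustrated on a toy example. The sketch below is not the authors' experiment and does not run contrastive learning: it hand-builds a synthetic representation matrix that is assumed to have the sub-class structure the abstract describes, then checks (i) the gap after the first k singular values, (ii) the alignment of the top singular vectors with the clean sub-class indicators, and (iii) whether a least-squares linear layer fit on noisy labels still predicts the clean ones. All sizes, the noise scale, and the 40% flip rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: k sub-classes, n_per points each, d-dim representations.
n_per, k, d = 200, 4, 32
n = n_per * k
clean = np.repeat(np.arange(k), n_per)  # clean sub-class labels

# Assumed structure: each sub-class sits near its own orthonormal center.
centers = np.linalg.qr(rng.normal(size=(d, k)))[0].T  # shape (k, d)
reps = centers[clean] + 0.05 * rng.normal(size=(n, d))  # toy representation matrix

# (i) One prominent singular value per sub-class, the rest much smaller.
u, s, _ = np.linalg.svd(reps, full_matrices=False)
gap = s[k - 1] / s[k]

# (ii) Energy of each normalized clean indicator vector that lies in the
# span of the top-k left singular vectors (1.0 means perfect alignment).
indicators = np.eye(k)[clean]
indicators /= np.linalg.norm(indicators, axis=0)
alignment = np.linalg.norm(u[:, :k].T @ indicators, axis=0) ** 2

# (iii) Symmetric label noise: flip 40% of labels to a different class.
flip = rng.random(n) < 0.4
noisy = np.where(flip, (clean + rng.integers(1, k, size=n)) % k, clean)

# Linear layer fit by least squares on the NOISY one-hot labels.
weights, *_ = np.linalg.lstsq(reps, np.eye(k)[noisy], rcond=None)
preds = (reps @ weights).argmax(axis=1)
acc_vs_clean = (preds == clean).mean()

print("singular value gap s_k / s_{k+1}:", gap)
print("alignment per sub-class:", alignment)
print("accuracy against clean labels:", acc_vs_clean)
```

In this toy setting the linear layer has far fewer parameters than there are noisy labels and the representation concentrates on k directions, so its predictions follow the majority label within each sub-class rather than the individual flipped labels.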
