Paper Title

GEDI: GEnerative and DIscriminative Training for Self-Supervised Learning

Authors

Sansone, Emanuele, Manhaeve, Robin

Abstract


Self-supervised learning is a popular and powerful method for utilizing large amounts of unlabeled data, for which a wide variety of training objectives have been proposed in the literature. In this study, we perform a Bayesian analysis of state-of-the-art self-supervised learning objectives and propose a unified formulation based on likelihood learning. Our analysis suggests a simple method for integrating self-supervised learning with generative models, allowing for the joint training of these two seemingly distinct approaches. We refer to this combined framework as GEDI, which stands for GEnerative and DIscriminative training. Additionally, we demonstrate an instantiation of the GEDI framework by integrating an energy-based model with a cluster-based self-supervised learning model. Through experiments on synthetic and real-world data, including SVHN, CIFAR10, and CIFAR100, we show that GEDI outperforms existing self-supervised learning strategies in terms of clustering performance by a wide margin. We also demonstrate that GEDI can be integrated into a neural-symbolic framework to address tasks in the small data regime, where it can use logical constraints to further improve clustering and classification performance.
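The abstract describes GEDI's core idea: jointly optimizing a generative term (an energy-based model) and a discriminative term (a cluster-based self-supervised objective) as a single loss. The following is a minimal toy sketch of what such a combined objective could look like; all function names, the softmax-over-batch partition surrogate, and the pseudo-label setup are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(z, w):
    """Toy linear energy function E(z) = -z @ w (lower energy = more likely)."""
    return -(z @ w)

def generative_loss(z, w):
    """Negative log-likelihood surrogate for an energy-based model,
    using a softmax over the batch as a stand-in for the intractable
    partition function (an assumption for this sketch)."""
    e = energy(z, w)
    return float(np.mean(e) + np.log(np.sum(np.exp(-e))))

def clustering_loss(logits, assignments):
    """Discriminative term: cross-entropy between predicted cluster
    logits and pseudo-labels obtained from clustering."""
    logp = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))
    return float(-np.mean(logp[np.arange(len(assignments)), assignments]))

z = rng.normal(size=(8, 4))          # batch of feature embeddings
w = rng.normal(size=4)               # toy energy parameters
logits = rng.normal(size=(8, 3))     # cluster predictions (3 clusters)
pseudo = rng.integers(0, 3, size=8)  # pseudo-labels from clustering

# GEDI-style joint objective: sum of generative and discriminative terms.
loss = generative_loss(z, w) + clustering_loss(logits, pseudo)
print(loss)
```

In practice both terms would share an encoder and be minimized jointly by gradient descent; the sketch only illustrates how the two losses combine into one scalar objective.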
