Paper title
On Leveraging Unlabeled Data for Concurrent Positive-Unlabeled Classification and Robust Generation
Paper authors
Abstract
The scarcity of class-labeled data is a ubiquitous bottleneck in many machine learning problems. Abundant unlabeled data typically exist and offer a potential solution, but exploiting them is highly challenging. In this paper, we address this problem by leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data simultaneously. We present a novel training framework that jointly targets both PU classification and conditional generation when exposed to extra data, especially out-of-distribution unlabeled data, by exploring the interplay between them: 1) enhancing the performance of PU classifiers with the assistance of a novel Classifier-Noise-Invariant Conditional GAN (CNI-CGAN) that is robust to noisy labels, and 2) leveraging extra data with predicted labels from a PU classifier to help the generation. Theoretically, we prove the optimal condition of CNI-CGAN; experimentally, we conduct extensive evaluations on diverse datasets.
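The second direction of the interplay, attaching predicted labels from a classifier to extra unlabeled data, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `pseudo_label`, the confidence threshold, and the toy probabilities are all assumptions made for illustration, and the abstract does not say whether any confidence filtering is used.

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Select unlabeled samples and assign them predicted labels.

    `probs` is an (n_samples, n_classes) array of class probabilities from
    some classifier. `threshold` is a hypothetical confidence cutoff (an
    assumption of this sketch, not something stated in the abstract).
    Returns the indices of the kept samples and their predicted labels.
    """
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)        # most likely class per sample
    confidence = probs.max(axis=1)       # probability of that class
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]

# Toy classifier outputs for four unlabeled samples over three classes.
probs = np.array([
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
    [0.10, 0.10, 0.80],
])
keep, labels = pseudo_label(probs, threshold=0.9)
```

The resulting (sample, predicted label) pairs could then be passed to a conditional generator as additional, possibly noisy, training data. That noise is the reason the abstract stresses a generator that is robust to noisy labels.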