Paper title

On Leveraging Unlabeled Data for Concurrent Positive-Unlabeled Classification and Robust Generation

Authors

Yu, Bing; Sun, Ke; Wang, He; Lin, Zhouchen; Zhu, Zhanxing

Abstract

The scarcity of class-labeled data is a ubiquitous bottleneck in many machine learning problems. While abundant unlabeled data typically exist and provide a potential solution, it is highly challenging to exploit them. In this paper, we address this problem by leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data simultaneously. We present a novel training framework that jointly targets both PU classification and conditional generation when exposed to extra data, especially out-of-distribution unlabeled data, by exploring the interplay between them: 1) enhancing the performance of PU classifiers with the assistance of a novel Classifier-Noise-Invariant Conditional GAN (CNI-CGAN) that is robust to noisy labels, and 2) leveraging extra data with predicted labels from a PU classifier to help the generation. Theoretically, we prove the optimal condition of CNI-CGAN; experimentally, we conduct extensive evaluations on diverse datasets.
