Paper title
Classification from Positive and Biased Negative Data with Skewed Labeled Posterior Probability
Paper authors
Abstract
In binary classification, there are situations where only biased data are observed for one of the classes. In this paper, we propose a new method for the positive and biased negative (PbN) classification problem, a weakly supervised learning setting in which a binary classifier is learned from positive data and from negative data with biased observations. We incorporate a method to correct the negative impact of skewed confidence, where confidence represents the posterior probability that the observed data are positive. This correction reduces the distortion of the posterior probability that the data are labeled, a quantity that is necessary for empirical risk minimization in the PbN classification problem. We verify the effectiveness of the proposed method through numerical experiments and real data analysis.