Paper Title
Fair Classification with Group-Dependent Label Noise
Paper Authors
Paper Abstract
This work examines how to train fair classifiers in settings where training labels are corrupted with random noise, and where the error rates of corruption depend both on the label class and on the membership function for a protected subgroup. Heterogeneous label noise models systematic biases towards particular groups when generating annotations. We begin by presenting analytical results which show that naively imposing parity constraints on demographic disparity measures, without accounting for heterogeneous and group-dependent error rates, can decrease both the accuracy and the fairness of the resulting classifier. Our experiments demonstrate these issues arise in practice as well. We address these problems by performing empirical risk minimization with carefully defined surrogate loss functions and surrogate constraints that help avoid the pitfalls introduced by heterogeneous label noise. We provide both theoretical and empirical justifications for the efficacy of our methods. We view our results as an important example of how imposing fairness on biased data sets without proper care can do at least as much harm as it does good.
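To make the surrogate-loss idea concrete, the sketch below shows one standard way to correct a convex loss for class-conditional label noise (the unbiased estimator of Natarajan et al., 2013), with the flip rates looked up per protected group. The function name `noise_corrected_loss`, its arguments, and the choice of a logistic base loss are illustrative assumptions for exposition, not the paper's exact construction.

```python
import numpy as np

def noise_corrected_loss(margin, y, group, noise_rates, base_loss=None):
    """Noise-corrected surrogate loss with group-dependent flip rates.

    margin:      real-valued classifier score f(x)
    y:           observed (possibly corrupted) label in {-1, +1}
    group:       protected-group identifier used to look up noise rates
    noise_rates: dict mapping group -> (rho_plus, rho_minus), where rho_plus
                 is P(observed = -1 | true = +1) and rho_minus is
                 P(observed = +1 | true = -1) for that group
    base_loss:   convex surrogate of the 0-1 loss; logistic loss by default
    """
    if base_loss is None:
        base_loss = lambda m, t: np.log1p(np.exp(-t * m))  # logistic loss

    rho_plus, rho_minus = noise_rates[group]
    rho_y = rho_plus if y == 1 else rho_minus       # flip rate of the observed class
    rho_other = rho_minus if y == 1 else rho_plus   # flip rate of the opposite class

    # Reweighting so that the expectation over the noisy label distribution
    # equals the loss that would be incurred on the clean label.
    numerator = (1.0 - rho_other) * base_loss(margin, y) - rho_y * base_loss(margin, -y)
    return numerator / (1.0 - rho_plus - rho_minus)
```

Because the rates are indexed by group, the same classifier score can receive different corrections in different subgroups; this is the kind of adjustment that a naive parity constraint on the noisy labels fails to make.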