Paper Title


Fair Inference for Discrete Latent Variable Models

Paper Authors

Rashidul Islam, Shimei Pan, James R. Foulds

Paper Abstract


It is now well understood that machine learning models, trained on data without due care, often exhibit unfair and discriminatory behavior against certain populations. Traditional algorithmic fairness research has mainly focused on supervised learning tasks, particularly classification. While fairness in unsupervised learning has received some attention, the literature has primarily addressed fair representation learning of continuous embeddings. In this paper, we conversely focus on unsupervised learning using probabilistic graphical models with discrete latent variables. We develop a fair stochastic variational inference technique for the discrete latent variables, which is accomplished by including a fairness penalty on the variational distribution that aims to respect the principles of intersectionality, a critical lens on fairness from the legal, social science, and humanities literature, and then optimizing the variational parameters under this penalty. We first show the utility of our method in improving equity and fairness for clustering using naïve Bayes and Gaussian mixture models on benchmark datasets. To demonstrate the generality of our approach and its potential for real-world impact, we then develop a special-purpose graphical model for criminal justice risk assessments, and use our fairness approach to prevent the inferences from encoding unfair societal biases.
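The abstract does not spell out the fairness criterion, but its intersectionality framing matches the ε-differential fairness definition from the authors' related work; the following is an assumed form, not a statement quoted from this paper. It requires that every inference outcome θ be roughly equally probable across all pairs of intersectional groups (s_i, s_j) defined by combinations of protected attributes:

```latex
% Assumed epsilon-differential fairness criterion for a mechanism M(x):
% for all outcomes \theta and all pairs of intersectional groups (s_i, s_j),
e^{-\epsilon} \;\le\; \frac{P\big(M(x) = \theta \mid s_i\big)}{P\big(M(x) = \theta \mid s_j\big)} \;\le\; e^{\epsilon}
```

A minimal sketch of how such a penalty could attach to stochastic variational inference over a discrete latent variable (e.g., a cluster assignment z_n): compute smoothed per-group probabilities of each latent state under the variational distribution q(z), penalize pairwise log-ratios between groups that exceed a target ε, and subtract the weighted penalty from the ELBO. All names here (differential_fairness_penalty, eps_target, lam) are hypothetical illustrations; the paper's exact penalty and optimization procedure may differ.

```python
import numpy as np

def differential_fairness_penalty(resp, groups, eps_target=0.0, smooth=1.0):
    """Hypothetical fairness penalty on variational responsibilities.

    resp[n, k] = q(z_n = k); groups[n] = intersectional group of point n.
    Penalizes pairwise log-ratios of per-group latent-state probabilities
    that exceed eps_target (the paper's actual penalty may differ)."""
    group_ids = np.unique(groups)
    K = resp.shape[1]
    # Smoothed probability of each latent state within each group under q(z).
    p = np.stack([
        (resp[groups == g].sum(axis=0) + smooth) / ((groups == g).sum() + smooth * K)
        for g in group_ids
    ])
    penalty = 0.0
    for i in range(len(group_ids)):
        for j in range(len(group_ids)):
            if i != j:
                log_ratio = np.abs(np.log(p[i]) - np.log(p[j]))
                penalty += np.maximum(log_ratio - eps_target, 0.0).sum()
    return penalty

# Toy usage: the penalized objective would be ELBO - lam * penalty,
# maximized over the variational parameters (ELBO omitted; model-specific).
rng = np.random.default_rng(0)
resp = rng.dirichlet(np.ones(3), size=100)  # N=100 points, K=3 latent states
groups = rng.integers(0, 4, size=100)       # 4 intersectional groups
lam = 10.0
print(lam * differential_fairness_penalty(resp, groups, eps_target=0.1))
```

The design mirrors how penalized variational objectives usually work: the penalty is a differentiable function of the variational parameters, so the same stochastic gradient updates that maximize the ELBO can trade likelihood against fairness through the weight lam.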
