Paper Title
Fair Densities via Boosting the Sufficient Statistics of Exponential Families
Paper Authors
Paper Abstract
We introduce a boosting algorithm to pre-process data for fairness. Starting from an initial fair but inaccurate distribution, our approach shifts towards better data fitting while still ensuring a minimal fairness guarantee. To do so, it learns the sufficient statistics of an exponential family with boosting-compliant convergence. Importantly, we are able to theoretically prove that the learned distribution will have representation rate and statistical rate data fairness guarantees. Unlike recent optimization-based pre-processing methods, our approach can be easily adapted to continuous domain features. Furthermore, when the weak learners are specified to be decision trees, the sufficient statistics of the learned distribution can be examined to provide clues on sources of (un)fairness. Empirical results are presented to demonstrate the quality of the results on real-world data.
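The abstract names two data-fairness measures, representation rate and statistical rate, without defining them. As a rough illustration only, the sketch below estimates both from a finite sample under one common reading, which is an assumption here and may differ from the paper's exact definitions: representation rate as the smallest ratio between the probabilities of any two sensitive groups, and statistical rate as the smallest ratio between the positive-label rates of any two groups. The function names and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter


def representation_rate(sensitive):
    """Smallest ratio P(S = a) / P(S = b) over pairs of observed groups.

    Assumed definition; equals min group frequency / max group frequency.
    """
    counts = Counter(sensitive)
    return min(counts.values()) / max(counts.values())


def statistical_rate(sensitive, labels, positive=1):
    """Smallest ratio P(Y = positive | S = a) / P(Y = positive | S = b).

    Assumed definition; equals min positive rate / max positive rate.
    """
    group_sizes = Counter(sensitive)
    positives = Counter(s for s, y in zip(sensitive, labels) if y == positive)
    rates = [positives[g] / group_sizes[g] for g in group_sizes]
    if max(rates) == 0:
        # Assumption: no group has any positive label, so all groups are
        # treated as equal and the rate is reported as 1.0.
        return 1.0
    return min(rates) / max(rates)


# Toy sample (hypothetical): group "a" has 6 rows, group "b" has 4 rows.
sensitive = ["a"] * 6 + ["b"] * 4
labels = [1, 1, 1, 0, 0, 0] + [1, 0, 0, 0]

rep = representation_rate(sensitive)       # (4/10) / (6/10)
stat = statistical_rate(sensitive, labels)  # (1/4) / (3/6)
print(f"representation rate: {rep:.3f}, statistical rate: {stat:.3f}")
```

Both quantities lie in [0, 1], with 1 indicating perfectly balanced groups under these assumed definitions. A fairness guarantee of the kind the abstract describes would then amount to a lower bound on such ratios for the learned distribution.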