Paper Title

Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness

Authors

Sumon Biswas, Hridesh Rajan

Abstract

Machine learning models are increasingly being used in important decision-making software such as approving bank loans, recommending criminal sentencing, hiring employees, and so on. It is important to ensure the fairness of these models so that no discrimination is made based on protected attributes (e.g., race, sex, age) during decision making. Algorithms have been developed to measure unfairness and mitigate it to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigation on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks, and then, using a comprehensive set of fairness metrics, evaluated their fairness. Then, we have applied 7 mitigation techniques to these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques induce unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithms also exhibit common patterns: for example, mitigation in the post-processing stage is often costly (in terms of performance), and mitigation in the pre-processing stage is preferred in most cases. We have also presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness-aware algorithms and the software engineering methods needed to leverage them in practice.
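To make the abstract's notion of "fairness metrics" concrete, below is a minimal illustrative sketch (not the paper's actual code or benchmark) of two common group-fairness metrics of the kind such a study would compute: statistical parity difference and disparate impact over binary model decisions. The function names and the 0/1 encoding of the protected attribute are our own assumptions for illustration.

```python
# Hypothetical sketch of two group-fairness metrics (not the paper's code).
# `preds` is a list of binary model decisions (1 = favorable outcome);
# `protected` marks group membership (1 = privileged, 0 = unprivileged).

def selection_rate(preds, mask):
    """Fraction of favorable decisions among the rows where mask is True."""
    members = [p for p, m in zip(preds, mask) if m]
    return sum(members) / len(members)

def statistical_parity_difference(preds, protected):
    """P(pred=1 | unprivileged) - P(pred=1 | privileged); 0 means parity."""
    unpriv = selection_rate(preds, [g == 0 for g in protected])
    priv = selection_rate(preds, [g == 1 for g in protected])
    return unpriv - priv

def disparate_impact(preds, protected):
    """Ratio of unprivileged to privileged selection rates;
    values below ~0.8 are conventionally flagged as biased."""
    unpriv = selection_rate(preds, [g == 0 for g in protected])
    priv = selection_rate(preds, [g == 1 for g in protected])
    return unpriv / priv

# Toy usage: the privileged group is approved 3/4 of the time,
# the unprivileged group only 1/4 of the time.
preds     = [1, 1, 1, 0, 1, 0, 0, 0]
protected = [1, 1, 1, 1, 0, 0, 0, 0]
print(statistical_parity_difference(preds, protected))  # -0.5
print(disparate_impact(preds, protected))               # 0.333...
```

In practice, toolkits such as IBM's AIF360 (which implements metrics and pre-, in-, and post-processing mitigation algorithms like those the study evaluates) provide production-ready versions of these computations.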
