Measuring Social Biases of Crowd Workers using Counterfactual Queries

Paper title

Measuring Social Biases of Crowd Workers using Counterfactual Queries

Paper authors

Ghai, Bhavya; Liao, Q. Vera; Zhang, Yunfeng; Mueller, Klaus

Paper abstract

Social biases based on gender, race, etc. have been shown to pollute machine learning (ML) pipelines, predominantly via biased training datasets. Crowdsourcing, a popular and cost-effective way to gather labeled training datasets, is not immune to the inherent social biases of crowd workers. To ensure that such social biases are not passed on to the curated datasets, it is important to know how biased each crowd worker is. In this work, we propose a new method based on counterfactual fairness to quantify the degree of inherent social bias in each crowd worker. This extra information can be leveraged together with individual worker responses to curate a less biased dataset.
