Paper Title

Towards Socially Responsible AI: Cognitive Bias-Aware Multi-Objective Learning

Authors

Procheta Sen, Debasis Ganguly

Abstract

Human society has had a long history of suffering from cognitive biases that lead to social prejudices and mass injustice. The prevalence of cognitive biases in large volumes of historical data poses the threat that they are manifested as unethical and seemingly inhuman predictions by AI systems trained on such data. To alleviate this problem, we propose a bias-aware multi-objective learning framework that, given a set of identity attributes (e.g., gender, ethnicity) and a subset of sensitive categories among the possible prediction output classes, learns to reduce the frequency of predicting certain combinations of them, e.g., stereotypes such as 'most blacks use abusive language' or 'fear is a virtue of women'. Our experiments, conducted on an emotion prediction task with balanced class priors, show that a set of baseline bias-agnostic models exhibit cognitive biases with respect to gender, such as that women are prone to be afraid whereas men are more prone to be angry. In contrast, our proposed bias-aware multi-objective learning methodology is shown to reduce such biases in the predicted emotions.
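To make the multi-objective idea concrete, below is a minimal sketch (not the authors' exact formulation) of how such a loss could be composed: a standard cross-entropy term for prediction accuracy plus a penalty on the probability mass that instances mentioning a protected identity (e.g., female) receive for a sensitive class (e.g., fear). The class indices, the mask `is_protected`, and the trade-off weight `lambda_bias` are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of a bias-aware multi-objective loss in PyTorch.
# Objective 1: cross-entropy (accuracy); Objective 2: reduce how often a
# sensitive class is predicted for a protected identity group.
import torch
import torch.nn.functional as F

NUM_CLASSES = 4          # e.g., anger, fear, joy, sadness (assumption)
FEAR = 1                 # index of a sensitive emotion class (assumption)
lambda_bias = 0.5        # accuracy/bias trade-off weight (assumption)

def bias_aware_loss(logits, labels, is_protected):
    """logits: (B, NUM_CLASSES); labels: (B,); is_protected: (B,) bool mask
    marking instances that mention the protected identity attribute."""
    ce = F.cross_entropy(logits, labels)           # objective 1: accuracy
    probs = F.softmax(logits, dim=-1)
    # Objective 2: mean predicted probability of the sensitive class on the
    # protected group; minimising it lowers the frequency of predicting
    # that (identity, class) combination.
    if is_protected.any():
        bias_term = probs[is_protected, FEAR].mean()
    else:
        bias_term = logits.new_zeros(())
    return ce + lambda_bias * bias_term

# Toy usage with random data
logits = torch.randn(8, NUM_CLASSES, requires_grad=True)
labels = torch.randint(0, NUM_CLASSES, (8,))
is_protected = torch.tensor([1, 0, 1, 0, 0, 1, 0, 1], dtype=torch.bool)
loss = bias_aware_loss(logits, labels, is_protected)
loss.backward()
print(float(loss))
```

In practice the penalty term could also be applied symmetrically to other (identity, class) pairs, with `lambda_bias` controlling how much predictive accuracy is traded for reduced stereotype frequency.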
