Paper Title


A methodology to characterize bias and harmful stereotypes in natural language processing in Latin America

Authors

Alemany, Laura Alonso, Benotti, Luciana, Maina, Hernán, González, Lucía, Rajngewerc, Mariela, Martínez, Lautaro, Sánchez, Jorge, Schilman, Mauro, Ivetta, Guido, Halvorsen, Alexia, Rojo, Amanda Mata, Bordone, Matías, Busaniche, Beatriz

Abstract


Automated decision-making systems, especially those based on natural language processing, are pervasive in our lives. They are not only behind the internet search engines we use daily, but also take on more critical roles: selecting candidates for a job, determining suspects of a crime, diagnosing autism, and more. Such automated systems make errors, which may be harmful in many ways, be it because of the severity of the consequences (as in health issues) or because of the sheer number of people they affect. When the errors made by an automated system affect some populations more than others, we call the system biased.

Most modern natural language technologies are based on artifacts obtained from enormous volumes of text using machine learning, namely language models and word embeddings. Since they are created by applying subsymbolic machine learning, mostly artificial neural networks, they are opaque and practically uninterpretable by direct inspection, which makes them very difficult to audit.

In this paper, we present a methodology that spells out how social scientists, domain experts, and machine learning experts can collaboratively explore biases and harmful stereotypes in word embeddings and large language models. Our methodology is based on the following principles:

* focus on the linguistic manifestations of discrimination in word embeddings and language models, not on the mathematical properties of the models
* reduce the technical barrier for discrimination experts, be they social scientists, domain experts, or others
* characterize through a qualitative exploratory process in addition to a metric-based approach
* address mitigation as part of the training process, not as an afterthought
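
The third principle pairs qualitative exploration with a metric-based approach. As a rough illustration of what such a metric can look like in a word-embedding space (a minimal sketch, not the paper's actual tooling), the snippet below compares how strongly target words associate with two attribute word sets via cosine similarity. The word lists, placeholder vectors, and function names here are assumptions made for the example.

```python
# Minimal sketch of a metric-based bias probe on word embeddings:
# compare how strongly target words (e.g., occupations) associate with two
# attribute sets (e.g., gendered terms) using cosine similarity.
# The vectors below are random placeholders; a real audit would load
# embeddings trained on the corpus under study.
import numpy as np

rng = np.random.default_rng(0)

# Placeholder embedding table (word -> 50-dim vector).
vocab = ["enfermera", "ingeniero", "mujer", "hombre", "ella", "él"]
embeddings = {w: rng.normal(size=50) for w in vocab}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(word, attr_a, attr_b):
    """Mean similarity to attribute set A minus mean similarity to set B.

    Positive values suggest the word sits closer to A than to B in the
    embedding space; the sign and magnitude are what a discrimination
    expert would then inspect qualitatively, word by word.
    """
    vec = embeddings[word]
    sim_a = np.mean([cosine(vec, embeddings[a]) for a in attr_a])
    sim_b = np.mean([cosine(vec, embeddings[b]) for b in attr_b])
    return sim_a - sim_b

female_terms = ["mujer", "ella"]
male_terms = ["hombre", "él"]

for target in ["enfermera", "ingeniero"]:
    print(target, round(association(target, female_terms, male_terms), 3))
```

In an actual audit, the word lists would be chosen by the discrimination experts the methodology centers, and the numeric scores would serve as a starting point for the qualitative exploration rather than as a verdict on their own.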
