Paper Title

Handling and Presenting Harmful Text in NLP Research

Authors

Kirk, Hannah Rose, Birhane, Abeba, Vidgen, Bertie, Derczynski, Leon

Abstract

Text data can pose a risk of harm. However, the risks are not fully understood, and how to handle, present, and discuss harmful text in a safe way remains an unresolved issue in the NLP community. We provide an analytical framework categorising harms on three axes: (1) the harm type (e.g., misinformation, hate speech or racial stereotypes); (2) whether a harm is \textit{sought} as a feature of the research design if explicitly studying harmful content (e.g., training a hate speech classifier), versus \textit{unsought} if harmful content is encountered when working on unrelated problems (e.g., language generation or part-of-speech tagging); and (3) who it affects, from people (mis)represented in the data to those handling the data and those publishing on the data. We provide advice for practitioners, with concrete steps for mitigating harm in research and in publication. To assist implementation we introduce \textsc{HarmCheck} -- a documentation standard for handling and presenting harmful text in research.
