Paper Title
"It's Not Just Hate'': A Multi-Dimensional Perspective on Detecting Harmful Speech Online
Paper Authors
Paper Abstract
Well-annotated data is a prerequisite for good Natural Language Processing models. Too often, though, annotation decisions are governed by optimizing time or annotator agreement. We make a case for nuanced, interdisciplinary annotation efforts for offensive online speech. Detecting offensive content is rapidly becoming one of the most important real-world NLP tasks. However, most datasets use a single binary label, e.g., for hate or incivility, even though each concept is multi-faceted. This modeling choice not only severely limits nuanced insights, but also performance. We show that a more fine-grained multi-label approach to predicting incivility and hateful or intolerant content addresses both conceptual and performance issues. We release a novel dataset of over 40,000 tweets about immigration from the US and UK, annotated with six labels for different aspects of incivility and intolerance. Our dataset not only allows for a more nuanced understanding of harmful speech online; models trained on it also outperform or match performance on benchmark datasets.
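The abstract does not specify the authors' model, so the snippet below is only a minimal sketch of the kind of multi-label setup it describes: a single text encoder with six independent sigmoid outputs, so one tweet can carry several aspects of incivility and intolerance at once. The model name `bert-base-uncased`, the example text, and the multi-hot target vector are illustrative assumptions, not the released dataset's actual schema or the paper's architecture.

```python
# Hedged sketch of a multi-label classifier for six harmful-speech aspects.
# Assumptions: Hugging Face Transformers encoder, placeholder label targets.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_LABELS = 6  # six aspects of incivility and intolerance, per the abstract

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # sigmoid + BCE instead of softmax
)

texts = ["example tweet about immigration policy"]  # placeholder input
# Multi-hot targets: a tweet may activate several labels simultaneously.
targets = torch.tensor([[1.0, 0.0, 1.0, 0.0, 0.0, 0.0]])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=targets)  # BCEWithLogitsLoss applied internally
loss = outputs.loss
probs = torch.sigmoid(outputs.logits)  # per-label probabilities, thresholded e.g. at 0.5
```

The design point the abstract argues for is visible here: unlike a single binary head, each of the six outputs is scored independently, so overlapping phenomena (e.g., a tweet that is both uncivil and intolerant) are not forced into one label.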