Towards Detection of Subjective Bias using Contextualized Word Embeddings

Paper title

Towards Detection of Subjective Bias using Contextualized Word Embeddings

Authors

Tanvi Dadu, Kartikey Pant, Radhika Mamidi

Abstract

Subjective bias detection is critical for applications like propaganda detection, content recommendation, sentiment analysis, and bias neutralization. This bias is introduced in natural language via inflammatory words and phrases, casting doubt over facts, and presupposing the truth. In this work, we perform comprehensive experiments for detecting subjective bias using BERT-based models on the Wiki Neutrality Corpus (WNC). The dataset consists of $360k$ labeled instances from Wikipedia edits that remove various instances of the bias. We further propose BERT-based ensembles that outperform state-of-the-art methods like $BERT_{large}$ by a margin of $5.6$ F1 score.
