Towards Interpretable Multilingual Detection of Hate Speech against Immigrants and Women in Twitter at SemEval-2019 Task 5

Paper Title

Towards Interpretable Multilingual Detection of Hate Speech against Immigrants and Women in Twitter at SemEval-2019 Task 5

Authors

Ishmam, Alvi Md

Abstract

This paper describes our techniques for detecting hate speech against women and immigrants on Twitter in multilingual contexts, particularly English and Spanish. The challenge was designed by SemEval-2019 Task 5, in which participants design algorithms to detect hate speech in English and Spanish with a given target (e.g., women or immigrants). We developed two deep neural networks, a Bidirectional Gated Recurrent Unit (GRU) and a character-level Convolutional Neural Network (CNN), and one machine learning model that exploits linguistic features. Our proposed model obtained F1 scores of 57 and 75 for Task A in English and Spanish, respectively. For Task B, the F1 scores are 67 for English and 75.33 for Spanish. For Task A (Spanish) and Task B (English and Spanish), the F1 scores improved by 2, 10, and 5 points, respectively. In addition, we present visually interpretable models that can address the generalizability issues of the custom-designed machine learning architecture by investigating the annotated dataset.
