Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models

Paper Title

Multilingual Auxiliary Tasks Training: Bridging the Gap between Languages for Zero-Shot Transfer of Hate Speech Detection Models

Authors

Syrielle Montariol, Arij Riabi, Djamé Seddah

Abstract

Zero-shot cross-lingual transfer learning has been shown to be highly challenging for tasks involving a lot of linguistic specificities or when a cultural gap is present between languages, such as in hate speech detection. In this paper, we highlight this limitation for hate speech detection in several domains and languages using strict experimental settings. Then, we propose to train on multilingual auxiliary tasks -- sentiment analysis, named entity recognition, and tasks relying on syntactic information -- to improve zero-shot transfer of hate speech detection models across languages. We show how hate speech detection models benefit from a cross-lingual knowledge proxy brought by auxiliary tasks fine-tuning and highlight these tasks' positive impact on bridging the hate speech linguistic and cultural gap between languages.
