Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning

Paper title

Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning

Authors

Md Tawkat Islam Khondaker, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan

Abstract

The prevalence of abusive language on different online platforms has been a major concern, raising the need for automated cross-platform abusive language detection. However, prior work focuses on concatenating data from multiple platforms, inherently adopting the Empirical Risk Minimization (ERM) method. In this work, we address this challenge from the perspective of a domain generalization objective. We design SCL-Fish, a meta-learning algorithm integrated with supervised contrastive learning, to detect abusive language on unseen platforms. Our experimental analysis shows that SCL-Fish achieves better performance than ERM and the existing state-of-the-art models. We also show that SCL-Fish is data-efficient and, upon finetuning for the abusive language detection task, achieves performance comparable to that of large-scale pre-trained models.

Paper title