Paper Title
PUM at SemEval-2020 Task 12: Aggregation of Transformer-based models' features for offensive language recognition
Paper Authors
Paper Abstract
In this paper, we describe the PUM team's entry to SemEval-2020 Task 12. Creating our solution involved leveraging two well-known pretrained models used in natural language processing: BERT and XLNet, which achieve state-of-the-art results in multiple NLP tasks. The models were fine-tuned for each subtask separately, and features taken from their hidden layers were combined and fed into a fully connected neural network. The model using aggregated Transformer features can serve as a powerful tool for the offensive language identification problem. Our team was ranked 7th out of 40 in Sub-task C - Offense target identification with 64.727% macro F1-score and 64th out of 85 in Sub-task A - Offensive language identification (89.726% F1-score).
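The aggregation step described above — concatenating hidden-layer features from BERT and XLNet and passing them to a fully connected classifier — can be sketched roughly as follows. This is a minimal illustration with stand-in random features and untrained weights; the hidden size, layer widths, and class count are assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

hidden = 768  # assumed hidden size (typical of BERT-base / XLNet-base)

# Stand-ins for pooled hidden-layer features of one input sentence,
# as would be extracted from the two fine-tuned Transformer models.
bert_feat = rng.standard_normal(hidden)
xlnet_feat = rng.standard_normal(hidden)

# Aggregation: concatenate the two feature vectors into one.
combined = np.concatenate([bert_feat, xlnet_feat])  # shape (1536,)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# A small fully connected head with random, untrained weights.
# Three output classes is an illustrative choice (e.g. the target
# categories of Sub-task C).
W1 = rng.standard_normal((256, combined.size)) * 0.01
b1 = np.zeros(256)
W2 = rng.standard_normal((3, 256)) * 0.01
b2 = np.zeros(3)

logits = W2 @ relu(W1 @ combined + b1) + b2
probs = softmax(logits)  # class probabilities summing to 1
```

In practice the feature extraction would come from the fine-tuned models' hidden states and the head would be trained jointly, but the shape of the computation is the same.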