Paper Title

KEIS@JUST at SemEval-2020 Task 12: Identifying Multilingual Offensive Tweets Using Weighted Ensemble and Fine-Tuned BERT

Authors

Saja Khaled Tawalbeh, Mahmoud Hammad, Mohammad AL-Smadi

Abstract

This paper presents the participation of our team, KEIS@JUST, in SemEval-2020 Task 12, the shared task on multilingual offensive language identification. We participated in all subtasks for all the provided languages, except sub-task A for English. We developed two main approaches. The first, applied to Arabic and English, is a weighted ensemble of Bi-GRU and CNN models followed by Gaussian noise and a global pooling layer multiplied by weights to improve overall performance. The second, applied to the other languages, uses transfer learning from BERT alongside recurrent neural networks such as Bi-LSTM and Bi-GRU, followed by a global average pooling layer. Word embeddings and contextual embeddings were used as features; in addition, data augmentation was used only for Arabic.
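
The abstract does not spell out how the ensemble weights are applied, so the following is only a minimal sketch of one common reading of "weighted ensemble": a weighted average of the offensive-class probabilities produced by two already-trained models. The function names, the 0.5 threshold, the `OFF`/`NOT` label strings, and the example weights are all assumptions made for illustration; the authors describe weights applied at the pooling layer, which may differ from this output-level averaging.

```python
from typing import List, Sequence


def weighted_ensemble(probs_per_model: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model probabilities (hypothetical helper).

    probs_per_model[m][i] is model m's probability that tweet i is offensive.
    weights[m] is the (assumed, hand-chosen) weight given to model m.
    """
    if len(probs_per_model) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(probs_per_model[0])
    if any(len(p) != n for p in probs_per_model):
        raise ValueError("all models must score the same number of tweets")
    # Normalise by the weight sum so the result stays a probability.
    return [
        sum(w * p[i] for w, p in zip(weights, probs_per_model)) / total
        for i in range(n)
    ]


def to_labels(scores: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Map ensemble scores to labels; threshold and label names are assumed."""
    return ["OFF" if s >= threshold else "NOT" for s in scores]
```

For example, calling `weighted_ensemble([bigru_probs, cnn_probs], [0.6, 0.4])` with two equal-length probability lists and then passing the result to `to_labels` yields one label per tweet. In practice the weights would be tuned on a development set rather than fixed by hand.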
