CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data

Paper title

CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data

Authors

Suman Dowlagar, Radhika Mamidi

Abstract

Identifying named entities is a practical and challenging task in Natural Language Processing. Named Entity Recognition (NER) on code-mixed text is harder still because of the linguistic complexity that the mixing introduces. This paper describes the submission of team CMNEROne to SemEval-2022 Task 11, MultiCoNER. The code-mixed NER task aims to identify named entities in a code-mixed dataset. Our approach performs NER on the code-mixed dataset by leveraging multilingual data. We achieved a weighted average F1 score of 0.7044, i.e., 6% greater than the baseline.
