Paper Title
A Dual-Contrastive Framework for Low-Resource Cross-Lingual Named Entity Recognition
Paper Authors
Paper Abstract
Cross-lingual Named Entity Recognition (NER) has recently become a research hotspot because it can alleviate the data-hungry problem for low-resource languages. However, few studies have focused on the scenario where source-language labeled data is also limited in certain specific domains. A common approach in this scenario is to generate more training data through translation- or generation-based data augmentation methods. Unfortunately, we find that simply combining source-language data with its translations cannot fully exploit the translated data, and the resulting improvements are limited. In this paper, we describe ConCNER, our novel dual-contrastive framework for cross-lingual NER under the scenario of limited source-language labeled data. Specifically, based on source-language samples and their translations, we design two contrastive objectives for cross-lingual NER at different grammatical levels: Translation Contrastive Learning (TCL), which pulls the sentence representations of translated sentence pairs closer together, and Label Contrastive Learning (LCL), which pulls together the representations of tokens sharing the same label. Furthermore, we apply knowledge distillation, using the NER model trained above as a teacher to train a student model on unlabeled target-language data so that it better fits the target language. We conduct extensive experiments on a wide variety of target languages, and the results demonstrate that ConCNER tends to outperform multiple baseline methods. For reproducibility, our code for this paper is available at https://github.com/GKLMIP/ConCNER.
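To make the abstract's three components concrete, here is a minimal NumPy sketch of the loss shapes involved. This is not the paper's implementation: the function names (`info_nce`, `label_contrastive`, `distill_loss`), the temperatures, and the toy embeddings are all illustrative assumptions; TCL and LCL are sketched as standard InfoNCE / supervised-contrastive variants, and the distillation step as soft-label KL divergence.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """TCL-style loss sketch: row i of `positives` is the positive for row i
    of `anchors` (e.g. its translation); other rows act as in-batch negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                      # (N, N) cosine sims
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))          # diagonal = positives

def label_contrastive(tokens, labels, temperature=0.1):
    """LCL-style loss sketch: tokens sharing an entity label are positives."""
    t = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = t @ t.T / temperature
    np.fill_diagonal(sim, -np.inf)                      # exclude self-pairs
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    losses = []
    for i, y in enumerate(labels):
        pos = [j for j in range(len(labels)) if labels[j] == y and j != i]
        if pos:
            losses.append(-np.mean(log_probs[i, pos]))
    return float(np.mean(losses))

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation sketch: KL(teacher || student) at temperature T."""
    def softmax(x):
        z = x / temperature
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)
    p_t, p_s = softmax(teacher_logits), softmax(student_logits)
    return float(np.mean((p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1)))

# Toy check: translated pairs are near-paraphrases, so the TCL-style loss
# should be far lower for them than for unrelated sentences.
rng = np.random.default_rng(0)
src = rng.normal(size=(4, 8))                # source-sentence embeddings
tgt = src + 0.01 * rng.normal(size=(4, 8))   # "translations" ~ paraphrases
loose = rng.normal(size=(4, 8))              # unrelated sentences
assert info_nce(src, tgt) < info_nce(src, loose)

# Toy check: labels that match the embedding clusters give a lower LCL loss.
a, b = rng.normal(size=8), rng.normal(size=8)
toks = np.vstack([a + 0.01 * rng.normal(size=8) for _ in range(3)]
                 + [b + 0.01 * rng.normal(size=8) for _ in range(3)])
good = [0, 0, 0, 1, 1, 1]   # labels aligned with the two clusters
bad = [0, 1, 0, 1, 0, 1]    # labels crossing the clusters
assert label_contrastive(toks, good) < label_contrastive(toks, bad)

# KD sanity check: identical teacher and student logits give zero loss.
tl = rng.normal(size=(2, 5))
assert distill_loss(tl, tl) < 1e-9
```

In the framework described above, the teacher would be trained with the NER objective plus the two contrastive terms, and the student would then minimize the distillation loss against the teacher's predictions on unlabeled target-language text.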