Paper Title

SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical Semantic Change

Authors

Maurício Gruppi, Sibel Adali, Pin-Yu Chen

Abstract

This paper describes SChME (Semantic Change Detection with Model Ensemble), a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change. SChME uses a model ensemble combining signals from distributional models (word embeddings) and word frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature. More specifically, we combine the cosine distance of word vectors with a neighborhood-based metric we named Mapped Neighborhood Distance (MAP) and a word frequency differential metric as input signals to our model. Additionally, we explore alignment-based methods to investigate the importance of the landmarks used in this process. Our results show evidence that the number of landmarks used for alignment has a direct impact on the predictive performance of the model. Moreover, we show that languages that suffer less semantic change tend to benefit from using a large number of landmarks, whereas languages with more semantic change benefit from a more careful choice of the number of landmarks used for alignment.
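
As a rough illustration of the pipeline the abstract outlines, the sketch below aligns two embedding matrices on a set of landmark words, computes a per-word cosine distance, and averages per-signal votes. It is not the authors' implementation. The orthogonal Procrustes alignment, the rank-based mapping from a signal to a vote, the absolute log-ratio used as the frequency differential, and all function names are assumptions made for this example; MAP is omitted because the abstract does not define it.

```python
import numpy as np

def procrustes_align(A, B, landmarks):
    """Rotate A onto B using only the landmark rows (orthogonal Procrustes).
    Assumption: the abstract does not state which alignment method is used."""
    U, _, Vt = np.linalg.svd(A[landmarks].T @ B[landmarks])
    return A @ (U @ Vt)

def cosine_distance(A, B):
    """Row-wise cosine distance between two aligned embedding matrices."""
    num = np.sum(A * B, axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return 1.0 - num / den

def to_vote(signal):
    """Map a raw signal to [0, 1] by its rank among all words.
    Assumption: a stand-in for however each model produces its probability."""
    ranks = np.argsort(np.argsort(signal))
    return ranks / (len(signal) - 1)

def ensemble_vote(signals):
    """Average the per-signal votes into one change score per word."""
    return np.mean([to_vote(s) for s in signals], axis=0)

# Toy data: 50 words, 8 dimensions. Period 1 is a rotated copy of period 2,
# except word 0, whose vector is flipped to simulate a semantic change.
rng = np.random.default_rng(0)
B = rng.normal(size=(50, 8))
Q = np.linalg.qr(rng.normal(size=(8, 8)))[0]   # random orthogonal matrix
A = B @ Q
A[0] = (-B[0]) @ Q

landmarks = np.arange(1, 50)                   # every word except word 0
cos = cosine_distance(procrustes_align(A, B, landmarks), B)

# Hypothetical frequency differential: absolute log-ratio of word counts.
freq1 = np.full(50, 100.0)
freq2 = freq1.copy()
freq2[0] = 400.0
freq_diff = np.abs(np.log(freq2) - np.log(freq1))

prob = ensemble_vote([cos, freq_diff])
print(int(np.argmax(prob)))                    # index of the highest-scoring word
```

Changing `landmarks` to a smaller or differently chosen subset is the knob the abstract refers to when it discusses the effect of the number of landmarks on performance.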
