Paper title
Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization
Paper authors
Paper abstract
Word sense induction (WSI) is a difficult problem in natural language processing that involves the unsupervised automatic detection of a word's senses (i.e., meanings). Some recent work achieves significant results on the WSI task by pre-training a language model dedicated exclusively to disambiguating word senses, whereas other work combines existing pre-trained language models with additional strategies to induce senses. In this paper, we propose a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). IIC is used to train a small model to maximize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation for use in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically demonstrate that our approach outperforms prior state-of-the-art WSI methods in certain cases and achieves competitive performance in the others.
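The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first function computes the mutual information that an IIC-style objective maximizes, taking paired soft assignments (one per paraphrase) as input. The second is a plain agglomerative clustering that stops either at a fixed number of clusters or at a distance threshold, mirroring the "fixed" and "dynamic" configurations. The function names, the average linkage, the Euclidean distance, and the threshold rule are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def iic_mutual_information(p_a, p_b, eps=1e-12):
    """Mutual information between paired soft assignments (IIC-style objective).

    p_a[i] and p_b[i] are probability vectors of length k produced for the
    same target word in the two paraphrases of instance i.
    """
    n, k = len(p_a), len(p_a[0])
    # Joint distribution: average of the outer products over all pairs.
    joint = [[0.0] * k for _ in range(k)]
    for a, b in zip(p_a, p_b):
        for i in range(k):
            for j in range(k):
                joint[i][j] += a[i] * b[j] / n
    # Symmetrize, since the order of the two paraphrases is arbitrary.
    joint = [[(joint[i][j] + joint[j][i]) / 2 for j in range(k)] for i in range(k)]
    row = [sum(joint[i]) for i in range(k)]
    col = [sum(joint[i][j] for i in range(k)) for j in range(k)]
    mi = 0.0
    for i in range(k):
        for j in range(k):
            p = joint[i][j]
            if p > eps:  # skip empty cells to avoid log(0)
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


def agglomerative(vectors, n_clusters=None, max_distance=None):
    """Average-linkage agglomerative clustering (assumed linkage and metric).

    Fixed mode: stop when n_clusters remain.
    Dynamic mode: stop when the closest pair is farther than max_distance.
    """
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        total = sum(math.dist(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        if n_clusters is not None and len(clusters) <= n_clusters:
            break
        d, a, b = min(
            (dist(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if max_distance is not None and d > max_distance:
            break
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters
```

In this sketch, a training loop would minimize the negative of `iic_mutual_information`, and the vectors passed to `agglomerative` would be the representations extracted from the trained model in inference mode. When two paraphrases always receive the same confident assignment across two balanced clusters, the mutual information reaches its maximum of log 2; when the assignments are uninformative, it is zero.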