Paper Title
Grounded Compositional Outputs for Adaptive Language Modeling
Paper Authors
Paper Abstract
Language models have emerged as a central component across NLP, and a great deal of progress depends on the ability to cheaply adapt them (e.g., through finetuning) to new domains and tasks. A language model's vocabulary, typically selected before training and permanently fixed afterwards, affects its size and is part of what makes it resistant to such adaptation. Prior work has used compositional input embeddings based on surface forms to ameliorate this issue. In this work, we go one step further and propose a fully compositional output embedding layer for language models, which is further grounded in information from a structured lexicon (WordNet), namely semantically related words and free-text definitions. To our knowledge, the result is the first word-level language model whose size does not depend on the training vocabulary. We evaluate the model on conventional language modeling as well as challenging cross-domain settings with an open vocabulary, finding that it matches or outperforms previous state-of-the-art output embedding methods and adaptation approaches. Our analysis attributes the improvements to sample efficiency: our model is more accurate for low-frequency words.
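To make the idea of a compositional output embedding layer concrete, below is a minimal PyTorch sketch, not the paper's exact architecture: all module and variable names (CompositionalOutputEmbedder, char_rnn, def_proj, etc.) are illustrative assumptions. The sketch composes each word's output embedding from its character sequence plus a pooled encoding of its definition, so the output layer carries no per-word parameters and its size is independent of the training vocabulary.

```python
import torch
import torch.nn as nn

class CompositionalOutputEmbedder(nn.Module):
    """Builds an output embedding for any word from its surface form and a
    definition encoding, instead of looking it up in a fixed vocabulary matrix.
    A sketch only; the paper's model differs in its composition functions."""

    def __init__(self, d_model: int, n_chars: int = 128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, d_model)   # surface form (characters)
        self.char_rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.def_proj = nn.Linear(d_model, d_model)      # grounding (definition) signal
        self.combine = nn.Linear(2 * d_model, d_model)

    def forward(self, char_ids, definition_vec):
        # char_ids: (num_words, max_chars) character ids per candidate word
        # definition_vec: (num_words, d_model), e.g. a pooled encoding of each
        # word's free-text definition (assumed precomputed here)
        _, h = self.char_rnn(self.char_emb(char_ids))    # h: (1, num_words, d_model)
        surface = h.squeeze(0)
        grounded = torch.tanh(self.def_proj(definition_vec))
        return self.combine(torch.cat([surface, grounded], dim=-1))

# Usage: score a candidate word set (which can change at test time, enabling an
# open vocabulary) against the language model's hidden state at one time step.
d = 64
embedder = CompositionalOutputEmbedder(d)
char_ids = torch.randint(0, 128, (10, 12))   # 10 candidate words, 12 chars each
def_vecs = torch.randn(10, d)                # placeholder definition encodings
hidden = torch.randn(1, d)                   # LM hidden state
logits = hidden @ embedder(char_ids, def_vecs).T   # (1, 10) next-word scores
```

Because the scores are a dot product between the hidden state and embeddings computed on the fly, new words seen only at adaptation time can be scored without adding parameters.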