Paper Title
Finding Universal Grammatical Relations in Multilingual BERT
Paper Authors
Paper Abstract
Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually. To better understand this overlap, we extend recent work on finding syntactic trees in neural networks' internal representations to the multilingual setting. We show that subspaces of mBERT representations recover syntactic tree distances in languages other than English, and that these subspaces are approximately shared across languages. Motivated by these results, we present an unsupervised analysis method that provides evidence mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. This evidence suggests that even without explicit supervision, multilingual masked language models learn certain linguistic universals.
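The abstract refers to recovering syntactic tree distances from subspaces of mBERT representations, which follows the structural-probe formulation of Hewitt and Manning (2019): a linear map B is learned so that the squared distance ||B(h_i - h_j)||^2 between projected token vectors approximates the distance between words i and j in the dependency tree. The sketch below illustrates that distance computation on mBERT hidden states; it is not the authors' released code, and the probe matrix here is random rather than trained, purely for illustration.

```python
# Minimal sketch of a structural-probe-style distance computation over mBERT
# hidden states. Shapes, model choice, and the untrained probe matrix are
# illustrative assumptions, not the paper's implementation.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")

sentence = "The chef who ran to the store was out of food"
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]  # (num_tokens, 768)

# The probe is a single linear map B; ||B(h_i - h_j)||^2 is trained to match
# the tree distance between words i and j. B is random here (hypothetical);
# in practice it is learned on treebank parse distances.
probe_rank, hidden_dim = 64, hidden.size(-1)
B = torch.randn(probe_rank, hidden_dim) / hidden_dim ** 0.5

diffs = hidden.unsqueeze(1) - hidden.unsqueeze(0)  # (n, n, 768) pairwise h_i - h_j
projected = diffs @ B.T                            # (n, n, probe_rank)
squared_distances = (projected ** 2).sum(-1)       # predicted pairwise tree distances

print(squared_distances.shape)
```

The paper's multilingual finding is that such a subspace, once identified, recovers tree distances across languages and is approximately shared among them.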