Paper Title
What makes multilingual BERT multilingual?
Paper Authors
Paper Abstract
Recently, multilingual BERT has performed remarkably well on cross-lingual transfer tasks, outperforming static non-contextualized word embeddings. In this work, we provide an in-depth experimental study to supplement the existing literature on cross-lingual ability. We compare the cross-lingual ability of non-contextualized and contextualized representation models trained on the same data. We find that data size and context window size are crucial factors for transferability.