Paper Title
Does Chinese BERT Encode Word Structure?
Paper Authors
Paper Abstract
Contextualized representations give significantly improved results for a wide range of NLP tasks. Much work has been dedicated to analyzing the features captured by representative models such as BERT. Existing work finds that syntactic, semantic, and word sense knowledge is encoded in BERT. However, little work has investigated word features for character-based languages such as Chinese. We investigate Chinese BERT using both attention weight distribution statistics and probing tasks, finding that (1) word information is captured by BERT; (2) word-level features are mostly in the middle representation layers; (3) downstream tasks make different use of word features in BERT, with POS tagging and chunking relying the most on word features, and natural language inference relying the least on such features.
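The abstract does not show code, so the following is a minimal sketch of the attention-weight-statistics idea, assuming the HuggingFace bert-base-chinese checkpoint: average each layer's attention over heads, then measure how much attention flows between characters that belong to the same word. The example sentence, its gold segmentation, and the aggregation choices are illustrative assumptions, not the authors' exact procedure.

```python
import torch
from transformers import BertModel, BertTokenizer

# bert-base-chinese tokenizes at the character level, so word boundaries
# must be supplied externally (e.g., from a gold segmentation).
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese", output_attentions=True)
model.eval()

# Illustrative sentence and segmentation (hypothetical, not from the paper):
# 北京 / 大学生 / 前来 / 应聘
sentence = "北京大学生前来应聘"
word_spans = [(0, 2), (2, 5), (5, 7), (7, 9)]  # character offsets per word

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, shape (1, heads, seq, seq);
# position 0 is [CLS], so character i sits at token position i + 1.
for layer_idx, layer_attn in enumerate(outputs.attentions):
    attn = layer_attn[0].mean(dim=0)  # average over heads -> (seq, seq)
    total, count = 0.0, 0
    for start, end in word_spans:
        for i in range(start, end):
            for j in range(start, end):
                if i != j:  # attention between distinct characters of one word
                    total += attn[1 + i, 1 + j].item()
                    count += 1
    print(f"layer {layer_idx + 1}: mean intra-word attention = {total / count:.4f}")
```

If finding (2) of the abstract holds, the printed intra-word attention mass would be expected to peak in the middle layers rather than at the bottom or top of the network.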