Paper Title

What Knowledge Is Needed? Towards Explainable Memory for kNN-MT Domain Adaptation

Authors

Wenhao Zhu, Shujian Huang, Yunzhe Lv, Xin Zheng, Jiajun Chen

Abstract

kNN-MT presents a new paradigm for domain adaptation by building an external datastore, which usually stores every target-language token occurrence in the parallel corpus. As a result, the constructed datastore is usually large and possibly redundant. In this paper, we investigate the interpretability issue of this approach: what knowledge does the NMT model need? We propose the notion of local correctness (LAC) as a new angle, which describes the potential translation correctness for a single entry and for a given neighborhood. An empirical study shows that our investigation successfully finds the conditions under which the NMT model could easily fail and need related knowledge. Experiments on six diverse target domains and two language pairs show that pruning according to local correctness yields a lighter and more explainable memory for kNN-MT domain adaptation.
