Paper Title

Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation

Authors

Shuhao Gu, Yang Feng

Abstract

Neural machine translation (NMT) models usually suffer from catastrophic forgetting during continual training: the models tend to gradually forget previously learned knowledge and swing toward fitting the newly added data, which may have a different distribution, e.g., a different domain. Although many methods have been proposed to address this problem, what causes the phenomenon is still not well understood. In the context of domain adaptation, we investigate the cause of catastrophic forgetting from the perspectives of modules and parameters (neurons). The investigation of the NMT model's modules shows that some modules are closely related to general-domain knowledge, while others are more essential for domain adaptation. The investigation of the parameters shows that some parameters are important for both general-domain and in-domain translation, and that large changes to them during continual training bring about the performance decline on the general domain. We conduct experiments across different language pairs and domains to ensure the validity and reliability of our findings.
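The abstract does not spell out how the module- and parameter-level analyses are carried out, so the two sketches below only illustrate the kind of probes it describes; they assume PyTorch models, and the function names (`restore_module_and_eval`, `estimate_parameter_importance`, `loss_fn`, `eval_fn`) are hypothetical, not from the paper.

A natural way to test whether a module holds general-domain knowledge is to restore that module of the fine-tuned model to its general-domain weights and measure the change in translation quality:

```python
def restore_module_and_eval(finetuned, general_state, module_prefix, eval_fn):
    """Restore one module of the fine-tuned model to its general-domain
    weights, evaluate, then put the fine-tuned weights back."""
    backup = {k: v.clone() for k, v in finetuned.state_dict().items()
              if k.startswith(module_prefix)}
    general = {k: v for k, v in general_state.items()
               if k.startswith(module_prefix)}
    finetuned.load_state_dict(general, strict=False)
    score = eval_fn(finetuned)  # e.g., BLEU on a general-domain test set
    finetuned.load_state_dict(backup, strict=False)
    return score
```

For parameters, a common importance proxy in the continual-learning literature is the accumulated squared gradient of the loss (a diagonal Fisher approximation):

```python
import torch

def estimate_parameter_importance(model, data_loader, loss_fn):
    """Accumulate a squared-gradient (diagonal-Fisher-style) importance
    score for every trainable parameter over one pass of a dataset."""
    importance = {name: torch.zeros_like(p)
                  for name, p in model.named_parameters() if p.requires_grad}
    model.eval()
    for batch in data_loader:
        model.zero_grad()
        loss = loss_fn(model, batch)  # hypothetical per-batch NMT loss
        loss.backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                importance[name] += p.grad.detach() ** 2
    # Average over batches so scores are comparable across datasets.
    return {name: s / len(data_loader) for name, s in importance.items()}
```

Scoring parameters once on general-domain data and once on in-domain data, then intersecting the highest-scoring sets, would single out the parameters the abstract describes as important to both; comparing how far those parameters move during continual training with the general-domain performance drop would test the claimed link.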
