Paper Title
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Authors
Abstract
In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix (what has been previously translated at a decoding step). However, previous work on interpretability in NMT has focused mainly on source sentence tokens' attributions. Therefore, we lack a full understanding of the influence of every input token (source sentence and target prefix) on the model predictions. In this work, we propose an interpretability method that tracks input tokens' attributions for both contexts. Our method, which can be extended to any encoder-decoder Transformer-based model, allows us to better comprehend the inner workings of current NMT models. We apply the proposed method to both bilingual and multilingual Transformers and present insights into their behaviour.