Paper Title

Training Mixed-Domain Translation Models via Federated Learning

Paper Authors

Peyman Passban, Tanya Roosta, Rahul Gupta, Ankit Chadha, Clement Chung

Paper Abstract

Training mixed-domain translation models is a complex task that demands tailored architectures and costly data preparation techniques. In this work, we leverage federated learning (FL) in order to tackle the problem. Our investigation demonstrates that with slight modifications in the training process, neural machine translation (NMT) engines can be easily adapted when an FL-based aggregation is applied to fuse different domains. Experimental results also show that engines built via FL are able to perform on par with state-of-the-art baselines that rely on centralized training techniques. We evaluate our hypothesis in the presence of five datasets with different sizes, from different domains, to translate from German into English and discuss how FL and NMT can mutually benefit from each other. In addition to providing benchmarking results on the union of FL and NMT, we also propose a novel technique to dynamically control the communication bandwidth by selecting impactful parameters during FL updates. This is a significant achievement considering the large size of NMT engines that need to be exchanged between FL parties.
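
The abstract describes two ideas: fusing per-domain models through FL-based aggregation, and reducing communication bandwidth by sending only impactful parameters during FL updates. The sketch below is an illustrative assumption of what such a scheme could look like, not the paper's implementation: a FedAvg-style weighted average of per-domain parameters combined with a simple top-k magnitude filter on each client's update. All names (`top_k_delta`, `aggregate`, `k_ratio`) are hypothetical.

```python
# Minimal sketch (assumption, not the paper's released code): FedAvg-style
# aggregation over per-domain NMT parameter updates, with a top-k filter
# that keeps only the largest-magnitude deltas to cut the bandwidth each
# client must transmit.
import numpy as np

def top_k_delta(old, new, k_ratio=0.1):
    """Keep the k_ratio fraction of parameters whose update is largest in
    magnitude; the rest are zeroed and need not be transmitted."""
    delta = new - old
    k = max(1, int(k_ratio * delta.size))
    threshold = np.partition(np.abs(delta).ravel(), -k)[-k]
    return delta * (np.abs(delta) >= threshold)

def aggregate(global_params, client_params, client_sizes, k_ratio=0.1):
    """Data-size-weighted average of sparsified client deltas, applied to
    the current global parameters, in the spirit of FedAvg."""
    weights = np.array(client_sizes, dtype=float)
    weights /= weights.sum()
    update = np.zeros_like(global_params)
    for w, params in zip(weights, client_params):
        update += w * top_k_delta(global_params, params, k_ratio)
    return global_params + update

# Toy usage: three "domains" (clients) holding different amounts of data.
rng = np.random.default_rng(0)
global_params = rng.normal(size=1000)
clients = [global_params + 0.01 * rng.normal(size=1000) for _ in range(3)]
global_params = aggregate(global_params, clients, client_sizes=[5e4, 2e5, 1e6])
```

In a real NMT setting the parameters would be the Transformer weights of each domain's engine rather than a flat vector, but the aggregation and sparsification logic would follow the same pattern.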
