Paper title
The Reality of Multi-Lingual Machine Translation
Paper authors
Paper abstract
Our book "The Reality of Multi-Lingual Machine Translation" discusses the benefits and perils of using more than two languages in machine translation systems. While focused on the particular task of sequence-to-sequence processing and multi-task learning, the book aims somewhat beyond the area of natural language processing. For us, machine translation is a prime example of a deep learning application in which human skills and learning capabilities are taken as a benchmark that many try to match and surpass. We document that some of the gains observed in multi-lingual translation may result from simpler effects than the assumed cross-lingual transfer of knowledge. The first, rather general part of the book leads you from the motivation for multi-linguality, through the versatility of deep neural networks, especially in sequence-to-sequence tasks, to the complications of this learning. We conclude the general part with warnings against overly optimistic and unjustified explanations of the gains that neural networks demonstrate. In the second part, we delve fully into multi-lingual models, with a particularly careful examination of transfer learning as one of the more straightforward approaches to utilizing additional languages. Recent multi-lingual techniques, including massive models, are surveyed, and practical aspects of deploying systems for many languages are discussed. The conclusion highlights the open problem of machine understanding and recalls two ethical aspects of building large-scale models: the inclusivity of research and its ecological footprint.