Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques

Paper title

Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques

Authors

Dinh, Tu Anh; Liu, Danni; Niehues, Jan

Abstract

Recently, end-to-end speech translation (ST) has gained significant attention as it avoids error propagation. However, the approach suffers from data scarcity. It heavily depends on direct ST data and is less efficient in making use of speech transcription and text translation data, which are often more easily available. In the related field of multilingual text translation, several techniques have been proposed for zero-shot translation. A main idea is to increase the similarity of semantically similar sentences in different languages. We investigate whether these ideas can be applied to speech translation, by building ST models trained on speech transcription and text translation data. We investigate the effects of data augmentation and auxiliary loss functions. The techniques were successfully applied to few-shot ST using limited ST data, with improvements of up to +12.9 BLEU points compared to direct end-to-end ST and +3.1 BLEU points compared to ST models fine-tuned from an ASR model.
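
The idea of increasing the similarity of semantically similar inputs through an auxiliary loss can be illustrated with a minimal sketch. The abstract does not say which auxiliary loss is used, so everything below is an assumption made for illustration only: mean pooling over time, cosine distance as the similarity term, the weight of 0.1, and the function names are all hypothetical and are not taken from the paper.

```python
import math


def mean_pool(states):
    """Average a sequence of equal-length vectors into a single vector."""
    dim = len(states[0])
    return [sum(vec[i] for vec in states) / len(states) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def auxiliary_similarity_loss(speech_states, text_states):
    """Hypothetical auxiliary loss: 1 - cosine similarity of pooled encoder states.

    The loss is near 0 when the pooled speech and text representations point
    in the same direction and grows as they diverge.
    """
    pooled_speech = mean_pool(speech_states)
    pooled_text = mean_pool(text_states)
    return 1.0 - cosine_similarity(pooled_speech, pooled_text)


def combined_loss(task_loss, speech_states, text_states, weight=0.1):
    """Main task loss plus the weighted auxiliary term (the weight is an assumption)."""
    return task_loss + weight * auxiliary_similarity_loss(speech_states, text_states)


# Toy encoder outputs: three speech frames and one text token, each of dimension 2.
speech = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text = [[2.0, 2.0]]
aligned_loss = auxiliary_similarity_loss(speech, text)
```

Mean pooling is used here only because speech and text sequences differ in length, so some length-independent summary is needed before the two representations can be compared.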
