Paper Title

Universal Phone Recognition with a Multilingual Allophone System

Paper Authors

Li, Xinjian; Dalmia, Siddharth; Li, Juncheng; Lee, Matthew; Littell, Patrick; Yao, Jiali; Anastasopoulos, Antonios; Mortensen, David R.; Neubig, Graham; Black, Alan W; Metze, Florian

Paper Abstract

Multilingual models can improve language processing, particularly for low resource situations, by sharing parameters across languages. Multilingual acoustic models, however, generally ignore the difference between phonemes (sounds that can support lexical contrasts in a particular language) and their corresponding phones (the sounds that are actually spoken, which are language independent). This can lead to performance degradation when combining a variety of training languages, as identically annotated phonemes can actually correspond to several different underlying phonetic realizations. In this work, we propose a joint model of both language-independent phone and language-dependent phoneme distributions. In multilingual ASR experiments over 11 languages, we find that this model improves testing performance by 2% phoneme error rate absolute in low-resource conditions. Additionally, because we are explicitly modeling language-independent phones, we can build a (nearly-)universal phone recognizer that, when combined with the PHOIBLE large, manually curated database of phone inventories, can be customized into 2,000 language dependent recognizers. Experiments on two low-resourced indigenous languages, Inuktitut and Tusom, show that our recognizer achieves phone accuracy improvements of more than 17%, moving a step closer to speech recognition for all languages in the world.
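To make the phone vs. phoneme distinction concrete, below is a minimal NumPy sketch of an allophone-style mapping layer. It is an illustration, not the authors' implementation: the toy phone set, the hypothetical language inventory, and the max-over-allophones reduction are assumptions made for this example.

```python
import numpy as np

# Sketch of an "allophone layer": a language-independent phone
# distribution is mapped to a language-dependent phoneme distribution
# using a binary (phoneme x phone) mask built from that language's
# allophone inventory. All names and values below are illustrative.

# Toy universal phone set (language-independent).
PHONES = ["p", "b", "t", "d", "th"]  # "th" stands in for aspirated [t^h]

# Toy inventory for a hypothetical language: the phoneme /t/ is realized
# as either [t] or [t^h]; the other phonemes map one-to-one.
ALLOPHONES = {
    "/p/": ["p"],
    "/b/": ["b"],
    "/t/": ["t", "th"],
    "/d/": ["d"],
}

def allophone_matrix(phones, allophones):
    """Build a binary (num_phonemes x num_phones) mask from the inventory."""
    mask = np.zeros((len(allophones), len(phones)))
    for i, phone_list in enumerate(allophones.values()):
        for ph in phone_list:
            mask[i, phones.index(ph)] = 1.0
    return mask

def phoneme_scores(phone_scores, mask):
    """Map universal phone scores to language-specific phoneme scores by
    taking, for each phoneme, the max over its allophones."""
    # Masked-out entries are set to -inf so they never win the max.
    masked = np.where(mask > 0, phone_scores[None, :], -np.inf)
    return masked.max(axis=1)

# Example: the acoustic model is most confident about the phone [t^h];
# after the allophone layer, that evidence supports the phoneme /t/.
phone_scores = np.array([0.1, -0.3, 0.8, -0.5, 1.4])
mask = allophone_matrix(PHONES, ALLOPHONES)
for phoneme, score in zip(ALLOPHONES, phoneme_scores(phone_scores, mask)):
    print(phoneme, round(float(score), 2))
# /t/ receives 1.4 (from [t^h]), so this language's recognizer predicts /t/.
```

In the same spirit, swapping in a different language's phone inventory (for instance, one read from a curated database such as PHOIBLE) yields a different phoneme-level recognizer on top of the same universal phone model.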
