Paper title
Goodness of Pronunciation Pipelines for OOV Problem
Paper authors
Abstract
In this report, we propose pipelines for Goodness of Pronunciation (GoP) computation that solve the out-of-vocabulary (OOV) problem at test time using vocabulary/lexicon expansion techniques. The pipelines use different components of an ASR system to quantify accent and automatically express it as scores. We use the posteriors of an ASR model trained on native English speech, along with phone-level boundaries, to obtain phone-level pronunciation scores. We use this as a baseline pipeline and implement methods to remove UNK and SPN phonemes from the GoP output by building three pipelines: Online, Offline, and Hybrid. Each pipeline returns the scores while also preventing unknown words in the final output. The Online method works per utterance, the Offline method pre-incorporates a set of OOV words for a given dataset, and the Hybrid method combines the two ideas, expanding the lexicon in advance as well as working per utterance. We further provide utilities such as phoneme-to-posterior mappings, the GoP scores of each utterance as a vector, and the word boundaries used in the GoP pipeline, for use in future research.
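To make the scoring step concrete, the sketch below shows one common simplified way to turn frame-level posteriors and phone-level boundaries into per-phone scores: the mean log posterior of the canonical phone over its aligned frames. This is an illustrative assumption, not necessarily the formula used in the report. The function name `gop_scores`, the segment layout `(phone_id, start_frame, end_frame)`, and the toy posterior values are all hypothetical.

```python
import math

def gop_scores(posteriors, segments, floor=1e-10):
    """Return one score per aligned phone segment.

    posteriors: list of frames; each frame is a list of per-phone
                probabilities from the acoustic model (assumed layout).
    segments:   list of (phone_id, start_frame, end_frame) tuples,
                with end_frame exclusive (assumed layout).
    The score is the mean log posterior of the canonical phone over
    the frames of its segment (a simplified GoP variant).
    """
    scores = []
    for phone_id, start, end in segments:
        frames = posteriors[start:end]
        if not frames:
            raise ValueError("segment contains no frames")
        # Floor the probability to avoid log(0) on degenerate frames.
        total = sum(math.log(max(frame[phone_id], floor)) for frame in frames)
        scores.append(total / len(frames))
    return scores

# Toy example: 3 frames, 2 phones; phone 0 spans frames 0-1, phone 1 spans frame 2.
posteriors = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
]
segments = [(0, 0, 2), (1, 2, 3)]

# The GoP scores of the utterance as a vector, one entry per phone.
utterance_gop = gop_scores(posteriors, segments)
```

Scores closer to zero indicate that the model assigns high probability to the expected phone. If an OOV word has been mapped to UNK or SPN, its segment is scored against the wrong phone, which is the problem that lexicon expansion is meant to avoid.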