Paper title
Improving pronunciation assessment via ordinal regression with anchored reference samples
Paper authors
Paper abstract
Sentence-level pronunciation assessment is important for Computer Assisted Language Learning (CALL). Traditional speech pronunciation assessment, based on the Goodness of Pronunciation (GOP) algorithm, has two weaknesses in assessing a speech utterance: 1) phoneme GOP scores cannot be easily translated into a sentence score with a simple average for effective assessment; 2) the rank ordering information has not been well exploited in GOP scoring to deliver a robust assessment that correlates well with a human rater's evaluations. In this paper, we propose two new statistical features, average GOP (aGOP) and confusion GOP (cGOP), and use them to train a binary classifier in Ordinal Regression with Anchored Reference Samples (ORARS). When the proposed approach is tested on the Microsoft mTutor ESL dataset, a relative improvement of 26.9% in Pearson correlation coefficient is obtained over the conventional GOP-based approach. The performance is at a human-parity level or better than that of human raters.
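To make the baseline and the evaluation metric concrete, the following is a minimal sketch, not the paper's implementation. It shows the "simple average" of phoneme GOP scores that the abstract identifies as the conventional sentence score, the Pearson correlation used to compare machine scores with human ratings, and how a relative improvement is computed. The function names and all numeric values are hypothetical and chosen only for illustration; the abstract does not specify how aGOP, cGOP, or the ORARS classifier are computed, so they are not sketched here.

```python
from math import sqrt


def sentence_score_by_average(phoneme_gops):
    """Baseline sentence score: plain mean of phoneme-level GOP scores."""
    return sum(phoneme_gops) / len(phoneme_gops)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def relative_improvement(r_new, r_base):
    """Relative gain of r_new over r_base, as a fraction of r_base."""
    return (r_new - r_base) / r_base


if __name__ == "__main__":
    # Hypothetical phoneme GOP scores for three utterances (toy values).
    utterances = [[-0.5, -1.5, -1.0], [-3.0, -2.0, -4.0], [-0.2, -0.4, -0.6]]
    machine = [sentence_score_by_average(u) for u in utterances]
    human = [4.0, 2.0, 5.0]  # hypothetical human ratings
    print("Pearson r:", pearson(machine, human))
    # Hypothetical correlations, not figures reported by the paper.
    print("Relative improvement:", relative_improvement(0.6345, 0.5))
```

A relative improvement is measured against the baseline correlation itself, so the 26.9% figure in the abstract is a ratio of correlation coefficients rather than an absolute gain of 0.269.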