Paper Title
Integrating Discrete and Neural Features via Mixed-feature Trans-dimensional Random Field Language Models
Paper Authors
Paper Abstract
There has been a long recognition that discrete features (n-gram features) and neural network based features have complementary strengths for language models (LMs). Improved performance can be obtained by model interpolation, which is, however, a suboptimal two-step integration of discrete and neural features. The trans-dimensional random field (TRF) framework has the potential advantage of being able to flexibly integrate a richer set of features. However, previous TRF LMs use either discrete or neural features alone. This paper develops a mixed-feature TRF LM and demonstrates its advantage in integrating discrete and neural features. Various LMs are trained over the PTB and Google one-billion-word datasets, and evaluated in N-best list rescoring experiments for speech recognition. Among all single LMs (i.e., without model interpolation), the mixed-feature TRF LMs perform the best, improving over both discrete TRF LMs and neural TRF LMs alone, and also being significantly better than LSTM LMs. Compared to interpolating two separately trained models with discrete and neural features respectively, the performance of mixed-feature TRF LMs matches the best interpolated model, with a simplified one-step training process and reduced training time.
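The core idea in the abstract — summing discrete n-gram features and a neural feature inside one unnormalized potential, then using that single model for N-best rescoring — can be illustrated with a minimal sketch. This is not the authors' code: the bigram weights, the placeholder neural scorer, and all function names here are hypothetical stand-ins for the learned TRF parameters.

```python
def discrete_potential(sentence, bigram_weights):
    """Sum of learned weights for each bigram feature firing in the sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(bigram_weights.get((a, b), 0.0)
               for a, b in zip(tokens, tokens[1:]))

def trf_score(sentence, bigram_weights, neural_scorer):
    # Unnormalized log-potential: discrete and neural features are summed
    # inside ONE model, rather than interpolating two separately trained LMs.
    return discrete_potential(sentence, bigram_weights) + neural_scorer(sentence)

def rescore_nbest(nbest, bigram_weights, neural_scorer):
    """Pick the hypothesis with the highest joint potential.

    For rescoring a fixed N-best list, only relative scores matter,
    so the (intractable) normalizing constant can be ignored here.
    """
    return max(nbest, key=lambda s: trf_score(s, bigram_weights, neural_scorer))

# Toy example with made-up weights and a trivial "neural" scorer
# (a stand-in for an LSTM- or transformer-style feature network).
weights = {("the", "cat"): 1.2, ("cat", "sat"): 0.8, ("the", "sat"): -0.5}
scorer = lambda s: 0.1 * len(s.split())
best = rescore_nbest(["the cat sat", "the sat cat"], weights, scorer)
# best == "the cat sat"
```

In the actual mixed-feature TRF LM, both the discrete feature weights and the neural network parameters are estimated jointly in one training run, which is the source of the simplified one-step training the abstract reports.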