End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator

Paper title

End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator

Authors

Guangzhi Sun, Chao Zhang, Philip C. Woodland

Abstract

End-to-end spoken language understanding (SLU) suffers from the long-tail word problem. This paper applies contextual biasing, a technique for improving the speech recognition of rare words, to end-to-end SLU systems. Specifically, it studies the tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, which uses a slot shortlist with corresponding entities to extract biasing lists. In addition, to bias the slot distribution output by the SLU model, a slot probability biasing (SPB) mechanism is proposed that calculates a slot distribution from TCPGen. Experiments on the SLURP dataset showed consistent SLU-F1 improvements using TCPGen and SPB, especially on unseen entities. On a new split that holds out 5 slot types for testing, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score over 50%, whereas the baselines cannot handle this setting. Beyond slot filling, intent classification accuracy also improved.
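
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: entities belonging to a shortlist of slots are collected into a biasing list, stored in a prefix tree, and the tree restricts which tokens a pointer distribution may cover at each step. Every name here (`PrefixTree`, `build_biasing_tree`, `interpolate`, the toy slot dictionary) is hypothetical, characters stand in for wordpieces, and the fixed interpolation weight `p_gen` is an assumption for illustration rather than the paper's actual formulation.

```python
from typing import Dict, List


class PrefixTree:
    """Toy prefix tree over token sequences (characters stand in for wordpieces)."""

    def __init__(self) -> None:
        self.children: Dict[str, "PrefixTree"] = {}
        self.is_end = False

    def insert(self, tokens: List[str]) -> None:
        node = self
        for tok in tokens:
            node = node.children.setdefault(tok, PrefixTree())
        node.is_end = True

    def valid_next(self, prefix: List[str]) -> List[str]:
        """Return the tokens that can extend `prefix` towards a biasing entity."""
        node = self
        for tok in prefix:
            if tok not in node.children:
                return []  # prefix has left the tree: nothing to point at
            node = node.children[tok]
        return sorted(node.children)


def build_biasing_tree(slot_entities: Dict[str, List[str]],
                       shortlist: List[str]) -> PrefixTree:
    """Collect entities of the shortlisted slots into one prefix tree."""
    tree = PrefixTree()
    for slot in shortlist:
        for entity in slot_entities.get(slot, []):
            tree.insert(list(entity))
    return tree


def interpolate(model_dist: Dict[str, float],
                pointer_dist: Dict[str, float],
                p_gen: float) -> Dict[str, float]:
    """Mix the model and pointer distributions with weight p_gen (assumed given)."""
    vocab = set(model_dist) | set(pointer_dist)
    return {
        tok: (1.0 - p_gen) * model_dist.get(tok, 0.0)
        + p_gen * pointer_dist.get(tok, 0.0)
        for tok in vocab
    }


# Hypothetical slot inventory; only "artist" and "place" are shortlisted.
slots = {"artist": ["adele", "abba"], "place": ["ankara"], "food": ["pizza"]}
tree = build_biasing_tree(slots, ["artist", "place"])
candidates = tree.valid_next(["a"])  # tokens the pointer may cover after "a"
mixed = interpolate({"a": 0.5, "b": 0.5}, {"a": 1.0}, p_gen=0.4)
```

In this sketch, entities from slots outside the shortlist (here `"food"`) never enter the tree, so they cannot receive pointer probability. In a real system the pointer distribution and the mixing weight would be predicted by the network at each decoding step.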
