Paper Title
End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator
Paper Authors
Abstract
End-to-end spoken language understanding (SLU) suffers from the long-tail word problem. This paper exploits contextual biasing, a technique for improving the speech recognition of rare words, in end-to-end SLU systems. Specifically, it studies the tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component that leverages a slot shortlist with corresponding entities to extract biasing lists. In addition, to bias the slot distribution output by the SLU model, a slot probability biasing (SPB) mechanism is proposed that calculates a slot distribution from TCPGen. Experiments on the SLURP dataset showed consistent SLU-F1 improvements using TCPGen and SPB, especially on unseen entities. On a new split that holds out 5 slot types for the test set, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score over 50%, whereas the baselines cannot handle these unseen slot types. In addition to slot filling, intent classification accuracy was also improved.
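To make the idea of extracting a biasing list from a slot shortlist and constraining it with a tree more concrete, here is a minimal sketch. It is not the paper's implementation: the slot names, the entities, the `SLOT_ENTITIES` table, and all function names are hypothetical, and the tree is built over characters for readability, whereas a real system would presumably use subword units and a neural pointer on top of the tree.

```python
from typing import Dict, List, Set

# Hypothetical slot-to-entity table; names and entries are illustrative only.
SLOT_ENTITIES: Dict[str, List[str]] = {
    "person": ["anna", "annabel"],
    "place_name": ["ankara"],
    "music_genre": ["jazz"],
}


def extract_biasing_list(slot_shortlist: List[str]) -> List[str]:
    """Collect the entities of every shortlisted slot type into one biasing list."""
    entities: List[str] = []
    for slot in slot_shortlist:
        entities.extend(SLOT_ENTITIES.get(slot, []))
    return entities


def build_prefix_tree(entities: List[str]) -> dict:
    """Build a character-level prefix tree (nested dicts); '$' marks the end of an entity."""
    root: dict = {}
    for entity in entities:
        node = root
        for ch in entity:
            node = node.setdefault(ch, {})
        node["$"] = {}
    return root


def valid_next_symbols(tree: dict, prefix: str) -> Set[str]:
    """Return the symbols the tree allows after `prefix` (empty set if the prefix is not in the tree)."""
    node = tree
    for ch in prefix:
        if ch not in node:
            return set()
        node = node[ch]
    return set(node.keys())


# Assumed shortlist: only "person" and "place_name" are considered likely for this utterance.
biasing_list = extract_biasing_list(["person", "place_name"])
tree = build_prefix_tree(biasing_list)
```

In this sketch, `valid_next_symbols(tree, "an")` returns only the continuations that keep the hypothesis inside a biasing entity, which is the sense in which the tree constrains where a pointer mechanism may place probability mass. How those constrained symbols are scored and combined with the model's own output distribution is not covered here.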