Paper Title

Exploring Sequence-to-Sequence Transformer-Transducer Models for Keyword Spotting

Authors

Labrador, Beltrán; Zhao, Guanlong; Moreno, Ignacio López; Scarpati, Angelo Scorza; Fowl, Liam; Wang, Quan

Abstract

In this paper, we present a novel approach to adapt a sequence-to-sequence Transformer-Transducer ASR system to the keyword spotting (KWS) task. We achieve this by replacing the keyword in the text transcription with a special token <kw> and training the system to detect the <kw> token in an audio stream. At inference time, we create a decision function inspired by conventional KWS approaches, to make our approach more suitable for the KWS task. Furthermore, we introduce a specific keyword spotting loss by adapting the sequence-discriminative Minimum Bayes-Risk training technique. We find that our approach significantly outperforms ASR-based KWS systems. When compared with a conventional keyword spotting system, our proposal has similar performance while bringing the advantages and flexibility of sequence-to-sequence training. Additionally, when combined with the conventional KWS system, our approach can improve the performance at any operating point.
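
The transcript-rewriting step mentioned in the abstract (replacing the keyword with a special token <kw>) can be illustrated with a minimal sketch. This is not the authors' code: the function name `replace_keyword`, the case-insensitive whole-phrase matching, and the example keyword "hey assistant" are all assumptions made purely for illustration.

```python
import re

KW_TOKEN = "<kw>"  # special token named in the abstract


def replace_keyword(transcript: str, keyword: str, token: str = KW_TOKEN) -> str:
    """Replace each whole-phrase occurrence of `keyword` with `token`.

    Hypothetical helper: the matching rules (case-insensitive,
    whole words only) are assumptions, not taken from the paper.
    """
    # Lookarounds keep the match from firing inside a longer word,
    # and re.escape lets multi-word keywords be used safely.
    pattern = re.compile(
        r"(?<!\w)" + re.escape(keyword) + r"(?!\w)",
        flags=re.IGNORECASE,
    )
    return pattern.sub(token, transcript)


if __name__ == "__main__":
    # "hey assistant" is a made-up keyword used only for this example.
    print(replace_keyword("hey assistant turn on the lights", "hey assistant"))
```

Under these assumptions, the rewritten transcripts would serve as training targets, so the model learns to emit <kw> wherever the keyword is spoken.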
