Paper Title

Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning

Authors

Robin Algayres, Adel Nabli, Benoit Sagot, Emmanuel Dupoux

Abstract

We introduce a simple neural encoder architecture that can be trained with an unsupervised contrastive learning objective whose positive samples come from a data-augmented k-nearest-neighbors search. We show that, when built on top of recent self-supervised audio representations, this method can be applied iteratively and yields competitive speech sequence embeddings (SSE), as evaluated on two tasks: query-by-example of random sequences of speech, and spoken term discovery. On both tasks, our method advances the state of the art by a significant margin across 5 different languages. Finally, we establish a benchmark on a query-by-example task on the LibriSpeech dataset to monitor future improvements in the field.
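The core idea in the abstract, mining positive pairs by nearest-neighbor search and then training with a contrastive objective, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`knn_positive_indices`, `info_nce_loss`), the cosine similarity measure, the InfoNCE-style loss, and the temperature value are all assumptions chosen for illustration, and the encoder, the data augmentation step, and the iterative re-mining are omitted.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def knn_positive_indices(embeddings, k=1):
    """Return, for each row, the indices of its k nearest neighbors.

    Neighbors are ranked by cosine similarity and a row is never
    returned as its own neighbor. Output shape: (n, k).
    """
    z = l2_normalize(np.asarray(embeddings, dtype=float))
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    return np.argsort(-sim, axis=1)[:, :k]


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss with in-batch negatives (an assumed choice).

    Row i of `positives` is the positive for row i of `anchors`;
    every other row of `positives` acts as a negative for anchor i.
    """
    a = l2_normalize(np.asarray(anchors, dtype=float))
    p = l2_normalize(np.asarray(positives, dtype=float))
    logits = (a @ p.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Toy usage: random vectors stand in for encoder outputs of speech segments.
rng = np.random.default_rng(0)
segment_embeddings = rng.normal(size=(8, 16))
nearest = knn_positive_indices(segment_embeddings, k=1)[:, 0]
loss = info_nce_loss(segment_embeddings, segment_embeddings[nearest])
```

In an iterative setup such as the one the abstract describes, the embeddings produced after one round of training would be used to mine a new set of neighbors for the next round. Note that in-batch negatives in this sketch may occasionally include a true neighbor of the anchor; a fuller implementation would need to handle that case.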
