Paper Title

Siamese Capsule Network for End-to-End Speaker Recognition In The Wild

Authors

Amirhossein Hajavi, Ali Etemad

Abstract

We propose an end-to-end deep model for speaker verification in the wild. Our model uses thin-ResNet to extract speaker embeddings from utterances, and a Siamese capsule network with dynamic routing as the back-end to calculate a similarity score between the embeddings. We conduct a series of experiments comparing our model to state-of-the-art solutions, showing that it outperforms all the other models while using a substantially smaller amount of training data. We also perform additional experiments to study the impact of different speaker embeddings on the Siamese capsule network. We show that the best performance is achieved by taking embeddings directly from the feature aggregation module of the front-end and passing them to higher capsules using dynamic routing.
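To make the back-end described in the abstract more concrete, here is a minimal NumPy sketch of routing-by-agreement applied to two embeddings through one shared set of weights (the Siamese part). It is an illustration under stated assumptions, not the authors' implementation: the capsule counts and dimensions, the random weights `W`, the helper names `squash`, `dynamic_routing`, `backend` and `similarity`, and the use of cosine similarity as the score are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Scale each vector so its length lies in [0, 1), keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement between two capsule layers.

    u_hat: prediction vectors, shape (num_in, num_out, dim_out).
    Returns the output capsules, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the output capsules.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions per output capsule, then squash.
        v = squash(np.einsum("ij,ijd->jd", c, u_hat))
        # Raise logits where a prediction agrees with the output capsule.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v

# Hypothetical sizes: 8 input capsules of dim 32, 4 output capsules of dim 16.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 4, 16, 32))  # shared by both branches

def backend(embedding_caps):
    """Map one utterance's input capsules (8, 32) to output capsules (4, 16)."""
    u_hat = np.einsum("ijkd,id->ijk", W, embedding_caps)
    return dynamic_routing(u_hat)

def similarity(v1, v2, eps=1e-9):
    """Cosine similarity of flattened output capsules (an assumed score)."""
    a, b = v1.ravel(), v2.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Stand-ins for front-end embeddings of two utterances.
x1 = rng.normal(size=(8, 32))
x2 = rng.normal(size=(8, 32))
score = similarity(backend(x1), backend(x2))
```

Because both utterances pass through the same `W`, the two branches produce comparable capsule outputs. In a trained system the weights would be learned jointly with the front-end rather than sampled at random.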
