Data augmentation enhanced speaker enrollment for text-dependent speaker verification

Paper title

Data augmentation enhanced speaker enrollment for text-dependent speaker verification

Authors

Sarkar, Achintya Kumar; Sarma, Himangshu; Dwivedi, Priyanka; Tan, Zheng-Hua

Abstract

Data augmentation is commonly used to generate additional data from the available training data so that the parameters of complex models, such as those used for speaker verification (SV), can be estimated robustly, especially in under-resourced applications. SV involves training speaker-independent (SI) models and speaker-dependent models, where each speaker is represented by a model derived from an SI model using that speaker's training data during the enrollment phase. While data augmentation for training SI models is well studied, data augmentation for speaker enrollment is rarely explored. In this paper, we propose the use of data augmentation methods to generate extra data to empower speaker enrollment. Each data augmentation method generates a new data set. Two strategies for using the data sets are explored: the first is to train separate systems and fuse them at the score level, and the second is to conduct multi-conditional training. Furthermore, we study the effect of data augmentation under noisy conditions. Experiments are performed on the RedDots Challenge 2016 database, and the results validate the effectiveness of the proposed methods.
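
The two strategies named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how scores are weighted or which augmentation methods are used, so everything here is an assumption made for illustration: the function names `fuse_scores` and `pool_enrollment_sets` are hypothetical, the fusion defaults to an equal-weight average, and the data set labels are placeholders.

```python
def fuse_scores(system_scores, weights=None):
    """Score-level fusion for one trial: weighted sum of per-system scores.

    Assumption: equal weights when none are given. The abstract does not
    specify the fusion weights.
    """
    if not system_scores:
        raise ValueError("at least one system score is required")
    if weights is None:
        weights = [1.0 / len(system_scores)] * len(system_scores)
    if len(weights) != len(system_scores):
        raise ValueError("one weight per system score is required")
    return sum(w * s for w, s in zip(weights, system_scores))


def pool_enrollment_sets(datasets):
    """Multi-condition enrollment: pool the original and augmented sets.

    `datasets` maps a (hypothetical) data set label to a list of utterance
    identifiers. A single speaker model would then be enrolled on the pool.
    """
    pooled = []
    for utterances in datasets.values():
        pooled.extend(utterances)
    return pooled


# Strategy 1: one system per data set, fused at the score level.
fused = fuse_scores([1.0, 3.0])

# Strategy 2: a single system enrolled on all data sets together.
pooled = pool_enrollment_sets({
    "original": ["utt1", "utt2"],
    "augmented_a": ["utt1_a", "utt2_a"],  # placeholder augmentation label
})
```

In the first strategy each data set yields its own enrolled model and only the scores are combined, while in the second the data are combined before enrollment and a single score is produced per trial.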
