VoxCeleb Speaker Recognition Challenge 2022

Paper Title

The DKU-Tencent System for the VoxCeleb Speaker Recognition Challenge 2022

Authors

Xiaoyi Qin, Na Li, Yuke Lin, Yiwei Ding, Chao Weng, Dan Su, Ming Li

Abstract

This paper describes the DKU-Tencent system for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC22). In this challenge, we focus on Track 1 and Track 3. For Track 1, multiple backbone networks are adopted to extract frame-level features. Since Track 1 focuses on cross-age scenarios, we adopt cross-age trials and apply quality measure functions (QMF) to calibrate the scores. The magnitude-based quality measures yield a large improvement. For Track 3, the semi-supervised domain adaptation task, a pseudo-label method is adopted to perform domain adaptation. To account for noisy labels produced by clustering, ArcFace is replaced by Sub-center ArcFace. The final submission achieves 0.107 mDCF in Track 1 and 7.135% EER in Track 3.
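To illustrate why Sub-center ArcFace can tolerate noisy pseudo labels, the following is a minimal NumPy sketch of its logit computation, not the authors' implementation. Each class holds several sub-centers, and the cosine to the closest sub-center is used as the class score, so mislabeled samples can drift to a non-dominant sub-center instead of distorting the main one. The function name, the number of sub-centers, and the margin and scale values are assumptions chosen for illustration; the abstract does not state them.

```python
import numpy as np

def subcenter_arcface_logits(embeddings, weights, labels, margin=0.2, scale=32.0):
    """Illustrative Sub-center ArcFace logits (margin and scale are assumed values).

    embeddings: (B, D) speaker embeddings
    weights:    (C, K, D) K sub-centers for each of C classes
    labels:     (B,) integer class labels (e.g. pseudo labels from clustering)
    """
    # L2-normalize embeddings and sub-centers so dot products are cosines.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=2, keepdims=True)

    # Cosine similarity to every sub-center: shape (B, C, K).
    cos_all = np.einsum("bd,ckd->bck", emb, w)

    # Max-pool over sub-centers: each class is scored by its closest sub-center.
    cos = np.clip(cos_all.max(axis=2), -1.0, 1.0)

    # Add the angular margin to the target class only (plain ArcFace step).
    # Simplification: no special handling when theta + margin exceeds pi.
    rows = np.arange(len(labels))
    logits = cos.copy()
    logits[rows, labels] = np.cos(np.arccos(cos[rows, labels]) + margin)

    return scale * logits
```

The returned logits would then be fed to a standard softmax cross-entropy loss. Setting K to 1 reduces the sketch to ordinary ArcFace.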
