Paper Title
Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification
Paper Authors
Paper Abstract
Countermeasure (CM) models are developed to protect automatic speaker verification (ASV) systems from spoofing attacks and to prevent the resulting leakage of personal information. For reasons of practicality and security, CM models are usually deployed on edge devices, which have more limited computing resources and storage space than cloud-based systems, which constrains the model size. To better trade off CM model size against performance, we propose an adversarial speaker distillation method, an improved version of knowledge distillation combined with generalized end-to-end (GE2E) pre-training and adversarial fine-tuning. On the evaluation partition of the ASVspoof 2021 Logical Access task, our proposed adversarial speaker distillation ResNetSE (ASD-ResNetSE) model reaches 0.2695 min t-DCF and 3.54% EER, while using only 22.5% of the parameters and 19.4% of the multiply-accumulate operations of the ResNetSE model.
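The abstract does not spell out the distillation objective, so the following is only a rough illustration of the standard knowledge-distillation loss that such methods typically build on: a temperature-softened cross-entropy against the teacher's output distribution combined with an ordinary cross-entropy against the hard label. The function names, the two-class (bonafide vs. spoof) logit setup, and the hyperparameter values are assumptions for illustration, not the paper's actual implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=2.0, alpha=0.5):
    """Illustrative knowledge-distillation objective (hypothetical setup):
    alpha * soft cross-entropy against the teacher's temperature-softened
    distribution + (1 - alpha) * cross-entropy against the ground-truth label.
    Logits are assumed to be 2-class scores (bonafide vs. spoof)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft term: H(p_teacher, p_student), scaled by T^2 as is conventional
    # so its gradient magnitude stays comparable across temperatures.
    soft = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))
    soft *= temperature ** 2
    # Hard term: ordinary cross-entropy on the true class at T = 1.
    p_hard = softmax(student_logits, 1.0)
    hard = -math.log(p_hard[hard_label])
    return alpha * soft + (1 - alpha) * hard
```

A student whose logits agree with the teacher and the label incurs a smaller loss than one that contradicts them, which is what drives the compressed CM model toward the larger model's behavior; the GE2E pre-training and adversarial fine-tuning stages described in the abstract would wrap around an objective of this kind.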