Paper Title
Backdoor Attack against Speaker Verification
Paper Authors
Paper Abstract
Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. Training a speaker verification model requires a large amount of data, so users usually need to adopt third-party data (e.g., data from the Internet or from third-party data companies). This raises the question of whether adopting untrusted third-party data poses a security threat. In this paper, we demonstrate that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data. Specifically, based on our understanding of verification tasks, we design a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers (i.e., pre-defined utterances). The infected models behave normally on benign samples, while attacker-specified unenrolled triggers successfully pass verification even though the attacker has no information about the enrolled speaker. We also demonstrate that existing backdoor attacks cannot be directly adopted to attack speaker verification. Our approach not only provides a new perspective for designing novel attacks, but also serves as a strong baseline for improving the robustness of verification methods. The code for reproducing the main results is available at \url{https://github.com/zhaitongqing233/Backdoor-attack-against-speaker-verification}.
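To make the clustering-based scheme concrete, the sketch below illustrates the idea in the abstract: speakers are grouped into clusters by their features, each cluster is assigned its own trigger (a pre-defined utterance, here a fixed audio snippet), and that trigger is superimposed on a fraction of each speaker's utterances. All function names, the minimal k-means routine, and the additive trigger-injection step are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of cluster-specific data poisoning for speaker
# verification (illustrative only; not the paper's implementation).
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Minimal k-means over per-speaker feature vectors; returns cluster labels."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # Assign each speaker to its nearest center.
        dists = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = np.argmin(dists, axis=1)
        # Move each center to the mean of its assigned speakers.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = features[labels == c].mean(axis=0)
    return labels

def poison_dataset(utterances, speaker_features, triggers, poison_rate=0.1, seed=0):
    """Superimpose each cluster's trigger on a fraction of its speakers' audio.

    utterances: dict mapping speaker index -> list of 1-D audio arrays.
    speaker_features: array of per-speaker feature vectors (one row per speaker).
    triggers: one trigger waveform per cluster (len(triggers) = number of clusters).
    """
    cluster_of = kmeans(speaker_features, len(triggers), seed=seed)
    rng = np.random.default_rng(seed)
    poisoned = []
    for spk, utts in utterances.items():
        trig = triggers[cluster_of[spk]]
        for u in utts:
            if rng.random() < poison_rate:
                u = u + trig[: len(u)]  # blend the cluster's trigger into the audio
            poisoned.append((spk, u))
    return poisoned
```

At attack time, the adversary would play the trigger assigned to whichever cluster the enrolled speaker falls into; the per-cluster design is what lets the attack succeed without knowing the enrolled speaker in advance.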