Paper Title

Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling

Authors

Jixun Yao, Qing Wang, Yi Lei, Pengcheng Guo, Lei Xie, Namin Wang, Jie Liu

Abstract

Speech data on the Internet are proliferating exponentially because of the emergence of social media, and sharing such personal data raises obvious security and privacy concerns. One way to mitigate these concerns is to conceal speaker identities before sharing speech data, a task referred to as speaker anonymization. In our previous work, we developed an automatic speaker verification (ASV)-model-free anonymization framework to protect speaker privacy while preserving speech intelligibility. Although that framework ranked first in the VoicePrivacy 2022 Challenge, the anonymization was imperfect because the speaker distinguishability of the anonymized speech deteriorated. To address this issue, in this paper we directly model the formant distribution and fundamental frequency (F0) to represent speaker identity, and we anonymize the source speech by uniformly scaling the formants and F0. Because the formants and F0 are scaled directly, no other speakers are introduced into the anonymized speech, which prevents the loss of speaker distinguishability that such mixing causes. The experimental results demonstrate that our proposed framework improves speaker distinguishability and significantly outperforms our previous framework in voice distinctiveness. Furthermore, our proposed method can trade off privacy against utility by using different scaling factors.
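To make "uniformly scaling the formants and F0" more concrete, the following NumPy sketch shows one simple way such scaling could be applied to frame-level features. It is an illustration under stated assumptions, not the authors' implementation: the function names `scale_f0` and `scale_formants`, the use of a linear-frequency spectral envelope, and the interpolation-based frequency warping are all hypothetical choices, and the abstract does not specify how the scaling is actually realized.

```python
import numpy as np


def scale_f0(f0_hz, factor):
    """Multiply voiced F0 values by `factor`.

    Assumption: unvoiced frames are marked with 0 Hz and are left at 0.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    return np.where(f0 > 0, f0 * factor, 0.0)


def scale_formants(envelope, factor):
    """Uniformly stretch the frequency axis of a spectral envelope.

    `envelope` has shape (n_frames, n_bins) on a linear frequency grid
    (an assumed representation). A spectral peak at bin k moves to
    roughly bin k * factor, so every formant is scaled by the same ratio.
    """
    env = np.asarray(envelope, dtype=float)
    n_bins = env.shape[-1]
    bins = np.arange(n_bins)
    # Output bin k reads the input envelope at position k / factor.
    # np.interp holds the edge value for positions past the last bin.
    src = bins / factor
    return np.stack([np.interp(src, bins, frame) for frame in env])


# Toy example: one frame with a single synthetic "formant" peak at bin 100.
toy_bins = np.arange(257)
toy_envelope = np.exp(-0.5 * ((toy_bins - 100) / 5.0) ** 2)[None, :]
warped_envelope = scale_formants(toy_envelope, 1.2)
scaled_f0 = scale_f0([0.0, 100.0, 200.0], 1.1)
```

In a sketch like this, a single scaling factor per speaker would control how far the voice moves from the original. This matches the abstract's remark that different scaling factors trade privacy against utility, although the mapping from factor to privacy level shown here is only a guess at the mechanism.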
