Paper Title

Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

Authors

Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu

Abstract

In order to protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications. One of the challenges in this context is to create non-existent voices that sound as natural as possible. In this work, we propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as the cost function. By incorporating these artificial embeddings into a speech-to-text-to-speech pipeline, we outperform previous approaches in terms of privacy and utility. According to standard objective metrics and human evaluation, our approach generates intelligible and content-preserving yet privacy-protecting versions of the original recordings.
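
The abstract names a generative adversarial network trained with the Wasserstein distance as cost function but gives no implementation detail. As a rough illustration only, the standard Wasserstein GAN critic and generator losses can be sketched as below. The linear generator and critic, the embedding size, and all function names are hypothetical stand-ins chosen for brevity, not the authors' architecture.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    # The critic is trained to score real samples higher than generated ones.
    # Minimizing mean(fake) - mean(real) maximizes the Wasserstein estimate.
    return float(np.mean(fake_scores) - np.mean(real_scores))

def generator_loss(fake_scores):
    # The generator is trained to raise the critic's score on its samples.
    return float(-np.mean(fake_scores))

# Toy data: random vectors standing in for real speaker embeddings (assumption).
rng = np.random.default_rng(0)
real_embeddings = rng.normal(size=(8, 16))

# Hypothetical linear generator (noise -> embedding) and linear critic.
generator_weights = rng.normal(size=(4, 16))
critic_weights = rng.normal(size=16)

noise = rng.normal(size=(8, 4))
fake_embeddings = noise @ generator_weights   # artificial speaker embeddings

real_scores = real_embeddings @ critic_weights
fake_scores = fake_embeddings @ critic_weights

print("critic loss:", critic_loss(real_scores, fake_scores))
print("generator loss:", generator_loss(fake_scores))
```

A practical WGAN additionally has to keep the critic approximately 1-Lipschitz (for example by weight clipping or a gradient penalty), which this sketch omits.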
