Paper Title
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Paper Authors
Paper Abstract
To protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between the protection of individuals and the usability of the data for downstream applications. One of the challenges in this context is to create non-existent voices that sound as natural as possible. In this work, we propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as the cost function. By incorporating these artificial embeddings into a speech-to-text-to-speech pipeline, we outperform previous approaches in terms of privacy and utility. According to standard objective metrics and human evaluation, our approach generates intelligible and content-preserving yet privacy-protecting versions of the original recordings.
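To make the "Wasserstein distance as cost function" idea concrete, the following is a minimal sketch of the two losses in a standard Wasserstein GAN, written in plain Python. It is not the authors' implementation: the abstract gives no training details, so the function names `critic_loss` and `generator_loss` and the toy scores are assumptions for illustration only. A real setup would also need a Lipschitz constraint on the critic (for example weight clipping or a gradient penalty), which is omitted here.

```python
from statistics import fmean


def critic_loss(real_scores, fake_scores):
    """Loss the critic minimizes: mean score on fakes minus mean score on reals.

    Minimizing this pushes real-embedding scores up and generated-embedding
    scores down; its negation estimates the Wasserstein-1 distance.
    """
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Loss the generator minimizes: negative mean critic score on its samples."""
    return -fmean(fake_scores)


# Toy critic outputs (made-up numbers, not results from the paper).
real_scores = [0.9, 1.1, 1.0]  # critic scores for embeddings of real speakers
fake_scores = [0.2, 0.0, 0.1]  # critic scores for generated speaker embeddings

# The critic's current estimate of the distance between the two distributions.
wasserstein_estimate = -critic_loss(real_scores, fake_scores)

print("critic loss:", critic_loss(real_scores, fake_scores))
print("generator loss:", generator_loss(fake_scores))
print("Wasserstein estimate:", wasserstein_estimate)
```

In this formulation the generator improves by producing speaker embeddings that the critic scores as highly as real ones, which shrinks the estimated distance between the real and artificial embedding distributions.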