Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

Paper Title

Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

Authors

Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu

Abstract

To protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings. This typically comes with a privacy-utility trade-off between protection of individuals and usability of the data for downstream applications. One of the challenges in this context is to create non-existent voices that sound as natural as possible. In this work, we propose to tackle this issue by generating speaker embeddings using a generative adversarial network with Wasserstein distance as the cost function. By incorporating these artificial embeddings into a speech-to-text-to-speech pipeline, we outperform previous approaches in terms of privacy and utility. According to standard objective metrics and human evaluation, our approach generates intelligible and content-preserving yet privacy-protecting versions of the original recordings.
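
As a rough illustration of what "Wasserstein distance as cost function" means for a GAN that generates speaker embeddings, the sketch below computes the standard Wasserstein GAN critic and generator losses from critic scores. It is a minimal sketch, not the authors' implementation: the function names, the noise dimension, and the example scores are all hypothetical, and the Lipschitz constraint on the critic (for example weight clipping or a gradient penalty) is omitted because the abstract does not say which variant is used.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def sample_noise(batch_size, noise_dim, rng):
    """Draw Gaussian noise vectors that a generator would map to embeddings.

    Hypothetical helper: the noise distribution and dimension are assumptions.
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
            for _ in range(batch_size)]


def critic_loss(real_scores, fake_scores):
    """WGAN critic loss: minimizing it widens the gap between the critic's
    mean score on real embeddings and its mean score on generated ones."""
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """WGAN generator loss: minimizing it raises the critic's mean score
    on generated embeddings."""
    return -mean(fake_scores)


# Toy critic scores (made up for illustration only).
real_scores = [1.0, 3.0]
fake_scores = [0.0, 1.0]
print(critic_loss(real_scores, fake_scores))
print(generator_loss(fake_scores))
```

In a full training loop, the critic and the generator would be neural networks updated in alternation with these two losses, and the trained generator would then map fresh noise vectors to artificial speaker embeddings for the speech synthesis stage.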
