FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

Paper title

FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances

Authors

Ali Shahin Shamsabadi, Francisco Sepúlveda Teixeira, Alberto Abad, Bhiksha Raj, Andrea Cavallaro, Isabel Trancoso

Abstract

Speaker identification models are vulnerable to carefully designed adversarial perturbations of their input signals that induce misclassification. In this work, we propose a white-box steganography-inspired adversarial attack that generates imperceptible adversarial perturbations against a speaker identification model. Our approach, FoolHD, uses a Gated Convolutional Autoencoder that operates in the DCT domain and is trained with a multi-objective loss function, in order to generate and conceal the adversarial perturbation within the original audio files. In addition to hindering speaker identification performance, this multi-objective loss accounts for human perception through a frame-wise cosine similarity between MFCC feature vectors extracted from the original and adversarial audio files. We validate the effectiveness of FoolHD with a 250-speaker identification x-vector network, trained using VoxCeleb, in terms of accuracy, success rate, and imperceptibility. Our results show that FoolHD generates highly imperceptible adversarial audio files (average PESQ scores above 4.30), while achieving a success rate of 99.6% and 99.2% in misleading the speaker identification model, for untargeted and targeted settings, respectively.
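The perceptual term mentioned in the abstract, a frame-wise cosine similarity between MFCC feature vectors of the original and adversarial audio, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the choices below are assumptions: the MFCC matrices are taken as already computed (one list of coefficients per frame), the similarity is averaged over frames, and the loss is defined as one minus that average. The names `cosine_similarity` and `framewise_cosine_loss` are hypothetical and not taken from the paper.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length coefficient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero (silent) frames.
    return dot / max(norm_u * norm_v, eps)


def framewise_cosine_loss(mfcc_orig, mfcc_adv):
    """Assumed perceptual loss: 1 - mean frame-wise cosine similarity.

    Both inputs are sequences of frames, each frame a sequence of MFCC
    coefficients. A value near 0 means the two feature sequences point
    in almost the same direction in every frame.
    """
    if len(mfcc_orig) != len(mfcc_adv) or len(mfcc_orig) == 0:
        raise ValueError("inputs must have the same, non-zero number of frames")
    sims = [cosine_similarity(u, v) for u, v in zip(mfcc_orig, mfcc_adv)]
    return 1.0 - sum(sims) / len(sims)


# Toy example with two frames of three coefficients each (made-up values).
original = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
perturbed = [[1.0, 2.1, 2.9], [0.5, -0.9, 2.1]]
loss = framewise_cosine_loss(original, perturbed)
```

In a multi-objective setup of the kind the abstract describes, a term like this would be combined with an adversarial term that drives the speaker identification model toward misclassification; how the two are weighted is not stated in the abstract.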
