A Knowledge Distillation-Based Backdoor Attack in Federated Learning

Paper Title

A Knowledge Distillation-Based Backdoor Attack in Federated Learning

Authors

Yifan Wang, Wei Fan, Keke Yang, Naji Alhusaini, Jing Li

Abstract

Federated learning (FL) is a novel framework for decentralized machine learning. Because of its decentralized nature, FL is vulnerable to adversarial attacks during training, e.g., backdoor attacks. A backdoor attack aims to inject a backdoor into a machine learning model so that the model exhibits arbitrary incorrect behavior on test samples that carry a specific backdoor trigger. Although a range of backdoor attack methods for FL have been introduced, there are also methods that defend against them. Many of these defenses exploit the abnormal characteristics of backdoored models, or the difference between backdoored models and regular models. To bypass these defenses, we need to reduce that difference and those abnormal characteristics. We find that one source of this abnormality is that a backdoor attack directly flips the labels of the data when poisoning it. However, current studies of backdoor attacks in FL do not mainly focus on reducing the difference between backdoored models and regular models. In this paper, we propose Adversarial Knowledge Distillation (ADVKD), a method that combines knowledge distillation with backdoor attacks in FL. With knowledge distillation, we can reduce the abnormal characteristics in the model that result from label flipping, so the model can bypass the defenses. Compared with current methods, we show that ADVKD not only reaches a higher attack success rate but also successfully bypasses the defenses when other methods fail. To further explore the performance of ADVKD, we test how its parameters affect its performance under different scenarios. Based on the experimental results, we summarize how to adjust the parameters for better performance in each scenario. We also use several methods to visualize the effects of different attacks and explain the effectiveness of ADVKD.
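The abstract does not state ADVKD's loss function, so the following is only a rough illustration of the contrast it draws between hard label flipping and distillation. The sketch uses the generic temperature-scaled knowledge distillation loss, which blends a hard-label term with a soft teacher term. The function names, the `alpha` and `temperature` parameters, and the source of the teacher logits are all assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(label, num_classes):
    """Hard target vector with all mass on a single class."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


def flipped_label_loss(student_logits, target_label):
    """Plain label flipping: push a triggered sample toward a hard target label."""
    probs = softmax(student_logits)
    return cross_entropy(one_hot(target_label, len(student_logits)), probs)


def distillation_loss(student_logits, teacher_logits, target_label,
                      alpha=0.5, temperature=2.0):
    """Generic distillation loss (hypothetical weighting, not the paper's).

    alpha weights the hard-label term; (1 - alpha) weights the soft term that
    matches the student to the teacher's softened output. The temperature**2
    factor is the usual scaling that keeps the two terms comparable.
    """
    hard = flipped_label_loss(student_logits, target_label)
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft = cross_entropy(teacher_soft, student_soft)
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft


if __name__ == "__main__":
    student = [0.2, 1.5, -0.3]   # illustrative logits for a triggered sample
    teacher = [0.1, 0.4, 2.0]    # illustrative teacher logits favoring class 2
    print("hard-label loss:  ", flipped_label_loss(student, target_label=2))
    print("distillation loss:", distillation_loss(student, teacher, target_label=2))
```

With `alpha=1.0` the distillation loss reduces to the plain flipped-label loss, so in this sketch the soft term is the only thing that separates the two training signals.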
