Paper Title

Neighboring Backdoor Attacks on Graph Convolutional Network

Authors

Liang Chen, Qibiao Peng, Jintang Li, Yang Liu, Jiawei Chen, Yong Li, Zibin Zheng

Abstract

Backdoor attacks have been widely studied as a way to hide misclassification rules in otherwise normal models, activated only when the model encounters a specific input (i.e., the trigger). However, despite their success in the conventional Euclidean space, backdoor attacks on graph-structured data have rarely been studied. In this paper, we propose a new type of backdoor specific to graph data, called the neighboring backdoor. Given the discreteness of graph data, the main challenge is how to effectively design triggers while retaining model accuracy on the original task. To address this challenge, we set the trigger to be a single node: the backdoor is activated when the trigger node is connected to the target node. To preserve model accuracy, the model parameters are not allowed to be modified; thus, when the trigger node is not connected, the model performs normally. Under these settings, this work focuses on generating the features of the trigger node. Two types of backdoors are proposed: (1) the Linear Graph Convolution Backdoor, which finds an approximate solution to the feature-generation problem (which can be viewed as an integer programming problem) by examining the linear part of GCNs; and (2) variants of existing graph attacks, extending current gradient-based attack methods to our backdoor attack scenario. Extensive experiments on two social network and two citation network datasets demonstrate that all proposed backdoors achieve an almost 100% attack success rate while having no impact on predictive accuracy.
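The linear-GCN intuition in the abstract can be sketched numerically: an SGC-style linear GCN computes logits as Â X W, so the contribution of an attached trigger node to the target node's logits is a single linear term, and trigger features aligned with the desired class direction of W can flip the prediction while leaving the clean graph's behavior untouched. This is a minimal illustrative sketch, not the paper's actual integer-programming construction; the graph, features, and weights are made-up toy values:

```python
import numpy as np

def normalized_adj(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def linear_gcn(A, X, W):
    """One-layer linear (SGC-style) GCN: logits = A_hat X W, no nonlinearity."""
    return normalized_adj(A) @ X @ W

# Toy path graph 0-1-2 with 2 features and 2 classes (hypothetical values).
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
X = np.array([[1., 0.], [0., 1.], [1., 1.]])
W = np.array([[1., -1.], [-1., 1.]])

target, target_class = 0, 1
clean_pred = linear_gcn(A, X, W).argmax(axis=1)  # model untouched, no trigger

# Craft trigger features along the class-margin direction of W, scaled to
# dominate the target node's neighborhood aggregation.
x_trig = 5.0 * (W[:, target_class] - W[:, 1 - target_class])

# Attach the trigger node by a single edge to the target node only.
A_bd = np.zeros((4, 4))
A_bd[:3, :3] = A
A_bd[3, target] = A_bd[target, 3] = 1.0
X_bd = np.vstack([X, x_trig])

bd_pred = linear_gcn(A_bd, X_bd, W).argmax(axis=1)
print(clean_pred[target], bd_pred[target])  # 0 1
```

Because the model parameters W are never modified, nodes not adjacent to the trigger keep their original predictions, matching the "performs normally when the trigger is not connected" property described above.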
