Title
LDL: A Defense for Label-Based Membership Inference Attacks
Authors
Abstract
The data used to train deep neural network (DNN) models in applications such as healthcare and finance typically contain sensitive information. DNN models are prone to overfitting, and overfitted models have been shown to be susceptible to query-based attacks such as membership inference attacks (MIAs). An MIA aims to determine whether a sample belongs to the dataset used to train a classifier (a member) or not (a nonmember). Recently, a new class of label-based MIAs (LAB MIAs) was proposed, in which the adversary only needs knowledge of the predicted labels of samples. Developing a defense against an adversary carrying out a LAB MIA on DNN models that cannot be retrained remains an open problem. We present LDL, a lightweight defense against LAB MIAs. LDL works by constructing a high-dimensional sphere around each queried sample such that the model decision is unchanged for (noisy) variants of the sample within the sphere. This sphere of label invariance creates ambiguity and prevents a querying adversary from correctly determining whether a sample is a member or a nonmember. We analytically characterize the success rate of an adversary carrying out a LAB MIA when LDL is deployed, and show that the formulation is consistent with experimental observations. We evaluate LDL on seven datasets -- CIFAR-10, CIFAR-100, GTSRB, Face, Purchase, Location, and Texas -- with varying sizes of training data, all of which have been used by state-of-the-art (SOTA) LAB MIAs. Our experiments demonstrate that LDL reduces the success rate of an adversary carrying out a LAB MIA in each case. We empirically compare LDL with defenses against LAB MIAs that require retraining of DNN models, and show that LDL performs favorably despite not requiring DNN retraining.
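To make the abstract's core idea concrete, here is a minimal illustrative sketch (not the paper's actual algorithm) of answering queries so that the returned label is stable for noisy variants of a sample inside an L2 ball. The function name `ldl_predict`, the radius, and the vote count are all illustrative assumptions; the sketch answers each query with the majority label over random perturbations drawn within the sphere, so small perturbations of the query receive the same answer.

```python
import numpy as np

def ldl_predict(classify, x, radius=0.1, n_samples=50, rng=None):
    """Illustrative sketch, not the paper's algorithm: answer a query
    with the majority label over noisy variants of x drawn inside an
    L2 ball of the given radius. Because nearby queries share most of
    their noisy variants' labels, slight perturbations of x tend to
    receive the same label, blunting a label-only membership probe."""
    rng = rng or np.random.default_rng(0)
    votes = {}
    for _ in range(n_samples):
        # Draw a random direction, then scale it to lie inside the ball.
        noise = rng.normal(size=x.shape)
        noise *= radius * rng.random() / (np.linalg.norm(noise) + 1e-12)
        label = classify(x + noise)
        votes[label] = votes.get(label, 0) + 1
    # Return the most frequent label among the noisy variants.
    return max(votes, key=votes.get)

# Toy classifier: label 1 if the coordinates sum to a positive value.
classify = lambda v: int(v.sum() > 0)
print(ldl_predict(classify, np.ones(4)))  # far from the boundary -> stable label 1
```

A LAB MIA probes how far a sample sits from the decision boundary using only predicted labels; responding from a label-invariant neighborhood like this removes that distance signal without retraining the underlying model.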