Paper Title
BagFlip: A Certified Defense against Data Poisoning
Paper Authors
Abstract
Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backdoor attacks or do not provide effective certificates (i.e., a proof of a defense). We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. We evaluate BagFlip on image classification and malware detection datasets. BagFlip is equal to or more effective than the state-of-the-art approaches for trigger-less attacks and more effective than the state-of-the-art approaches for backdoor attacks.