Paper Title
BagFlip: A Certified Defense against Data Poisoning
Paper Authors
Abstract
Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backdoor attacks or do not provide effective certificates (i.e., a proof of a defense). We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. We evaluate BagFlip on image classification and malware detection datasets. BagFlip is equal to or more effective than the state-of-the-art approaches for trigger-less attacks and more effective than the state-of-the-art approaches for backdoor attacks.