AdvMind: Inferring Adversary Intent of Black-Box Attacks

Paper Title

AdvMind: Inferring Adversary Intent of Black-Box Attacks

Authors

Pang, Ren; Zhang, Xinyang; Ji, Shouling; Luo, Xiapu; Wang, Ting

Abstract

Deep neural networks (DNNs) are inherently susceptible to adversarial attacks even under black-box settings, in which the adversary only has query access to the target models. In practice, while it may be possible to effectively detect such attacks (e.g., observing massive similar but non-identical queries), it is often challenging to exactly infer the adversary intent (e.g., the target class of the adversarial example the adversary attempts to craft) especially during early stages of the attacks, which is crucial for performing effective deterrence and remediation of the threats in many scenarios. In this paper, we present AdvMind, a new class of estimation models that infer the adversary intent of black-box adversarial attacks in a robust and prompt manner. Specifically, to achieve robust detection, AdvMind accounts for the adversary adaptiveness such that her attempt to conceal the target will significantly increase the attack cost (e.g., in terms of the number of queries); to achieve prompt detection, AdvMind proactively synthesizes plausible query results to solicit subsequent queries from the adversary that maximally expose her intent. Through extensive empirical evaluation on benchmark datasets and state-of-the-art black-box attacks, we demonstrate that on average AdvMind detects the adversary intent with over 75% accuracy after observing less than 3 query batches and meanwhile increases the cost of adaptive attacks by over 60%. We further discuss the possible synergy between AdvMind and other defense methods against black-box adversarial attacks, pointing to several promising research directions.
