Paper Title
CorrAttack: Black-box Adversarial Attack with Structured Search
Paper Authors
Paper Abstract
We present a new method for score-based adversarial attack, where the attacker queries the loss oracle of the target model. Our method employs a parameterized search space with a structure that captures the relationship between the gradients of the loss function. We show that searching over the structured space can be approximated by a time-varying contextual bandits problem, where the attacker takes the features of the associated arm to modify the input, and receives an immediate reward equal to the reduction of the loss function. The time-varying contextual bandits problem can then be solved by a Bayesian optimization procedure, which can take advantage of the features of the structured action space. Experiments on ImageNet and the Google Cloud Vision API demonstrate that the proposed method achieves state-of-the-art success rates and query efficiency for both undefended and defended models.
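A minimal sketch of the idea described in the abstract, not the authors' implementation: arms are block-wise perturbations of the input, each arm carries a feature vector (its block coordinates and perturbation sign), the reward of pulling an arm is the observed drop in the loss returned by the black-box oracle, and a small Gaussian-process surrogate with a UCB acquisition stands in for the paper's Bayesian optimization procedure. The loss oracle, block size, and perturbation magnitude below are illustrative assumptions; in a real attack the oracle would be the target model's loss on the perturbed image.

```python
import numpy as np

rng = np.random.default_rng(0)

D, BLOCK = 32, 8                      # toy "image" size and block size (assumptions)
x = rng.normal(size=(D, D))           # input to perturb
eps = 0.5                             # per-block perturbation size (assumption)
target = rng.normal(size=(D, D))      # defines the toy loss landscape

def loss_oracle(img):
    """Toy black-box loss; a real attack would query the target model here."""
    return float(np.sum((img - target) ** 2))

# Structured action space: one arm per (block, sign); feature = (row, col, sign).
arms = [(r, c, s) for r in range(0, D, BLOCK)
                  for c in range(0, D, BLOCK)
                  for s in (-1.0, 1.0)]
feats = np.array([(r / D, c / D, s) for r, c, s in arms])

def rbf(A, B, ls=0.3):
    """RBF kernel between two feature matrices."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def apply_arm(img, arm):
    """Pulling an arm adds a signed constant perturbation to one block."""
    r, c, s = arm
    out = img.copy()
    out[r:r + BLOCK, c:c + BLOCK] += s * eps
    return out

X_obs, y_obs = [], []                 # observed (arm feature, reward) pairs
cur_loss = loss_oracle(x)

for t in range(40):                   # query budget
    if len(X_obs) < 3:                # cold start: pull a few random arms
        idx = rng.integers(len(arms))
    else:                             # GP-UCB acquisition over arm features
        Xo, yo = np.array(X_obs), np.array(y_obs)
        K = rbf(Xo, Xo) + 1e-4 * np.eye(len(Xo))
        Ks = rbf(Xo, feats)
        alpha = np.linalg.solve(K, yo)
        mu = Ks.T @ alpha                              # posterior mean reward
        v = np.linalg.solve(K, Ks)
        var = np.clip(1.0 - np.sum(Ks * v, axis=0), 1e-9, None)
        idx = int(np.argmax(mu + 2.0 * np.sqrt(var)))  # optimistic arm choice

    cand = apply_arm(x, arms[idx])
    new_loss = loss_oracle(cand)
    reward = cur_loss - new_loss      # immediate reward = reduction of the loss
    X_obs.append(feats[idx]); y_obs.append(reward)
    if reward > 0:                    # keep the perturbation only if it helps
        x, cur_loss = cand, new_loss

print(f"final loss after 40 queries: {cur_loss:.3f}")
```

Because the input changes whenever a helpful perturbation is kept, the reward of each arm drifts over time, which is the time-varying aspect of the bandits formulation; the surrogate here simply refits on all past observations, whereas the paper's procedure handles this drift explicitly.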