Paper Title
Gradient Aligned Attacks via a Few Queries
Paper Authors
Paper Abstract
Black-box query attacks, which rely only on the output of the victim model, have proven to be effective in attacking deep learning models. However, existing black-box query attacks show low performance in a novel scenario where only a few queries are allowed. To address this issue, we propose gradient aligned attacks (GAA), which use the gradient aligned losses (GAL) we designed on the surrogate model to estimate an accurate gradient and thereby improve the attack performance on the victim model. Specifically, we propose a gradient aligned mechanism to ensure that the derivatives of the loss function with respect to the logit vector have the same weight coefficients on the surrogate and victim models. Using this mechanism, we transform the cross-entropy (CE) loss and margin loss into gradient aligned forms, i.e., the gradient aligned CE and margin losses. These losses not only improve the attack performance of our gradient aligned attacks in the novel scenario but also increase the query efficiency of existing black-box query attacks. Through theoretical and empirical analysis on the ImageNet database, we demonstrate that our gradient aligned mechanism is effective, and that our gradient aligned attacks improve the attack performance in the novel scenario by 16.1\% and 31.3\% under the $l_2$ and $l_{\infty}$ norms of the box constraint, respectively, compared to four recent transferable prior-based query attacks. Additionally, the gradient aligned losses significantly reduce the number of queries required by these transferable prior-based query attacks, by up to a factor of 2.9. Overall, our proposed gradient aligned attacks and losses show significant improvements in the attack performance and query efficiency of black-box query attacks, particularly in scenarios where only a few queries are allowed.
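The gradient aligned mechanism above can be illustrated with a minimal numpy sketch. This is one plausible reading of the abstract, not the paper's actual implementation: for cross-entropy, the derivative of the loss with respect to logit $z_i$ is $p_i - y_i$, so if the weight coefficients $p_i$ are taken from the victim model's queried output probabilities and applied linearly to the surrogate model's logits, the resulting loss has a logit-gradient that matches the victim's coefficients by construction. The function names here are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def gradient_aligned_ce(surrogate_logits, victim_probs, label, num_classes):
    """Hypothetical sketch of a gradient aligned CE loss.

    For cross-entropy, dL/dz_i = p_i - y_i. Taking p from the victim
    model's queried output fixes the weight coefficients; the loss is
    then linear in the surrogate logits, so its derivative w.r.t. the
    surrogate logit vector equals the victim-side coefficients w.
    """
    y = np.eye(num_classes)[label]          # one-hot ground-truth label
    w = victim_probs - y                    # victim's loss-to-logit coefficients
    return float(w @ surrogate_logits)      # linear in surrogate logits
```

Because the loss is linear in the surrogate logits, back-propagating it through the surrogate pushes the victim-aligned coefficients `w` into the surrogate's logit Jacobian, which is the alignment property the abstract attributes to the mechanism.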