Paper Title
Dynamically Sampled Nonlocal Gradients for Stronger Adversarial Attacks
Paper Authors
Paper Abstract
The vulnerability of deep neural networks to small and even imperceptible perturbations has become a central topic in deep learning research. Although several sophisticated defense mechanisms have been introduced, most were later shown to be ineffective. However, a reliable evaluation of model robustness is mandatory for deployment in safety-critical scenarios. To overcome this problem, we propose a simple yet effective modification to the gradient calculation of state-of-the-art first-order adversarial attacks. Normally, the gradient update of an attack is calculated directly at the given data point. This approach is sensitive to noise and small local optima of the loss function. Inspired by gradient sampling techniques from non-convex optimization, we propose Dynamically Sampled Nonlocal Gradient Descent (DSNGD). DSNGD calculates the gradient direction of the adversarial attack as a weighted average over past gradients from the optimization history. Moreover, the distribution hyperparameters that define the sampling operation are learned automatically during optimization. We empirically show that by incorporating this nonlocal gradient information, we obtain a more accurate estimate of the global descent direction on noisy and non-convex loss surfaces. In addition, we show that DSNGD-based attacks are on average 35% faster while achieving 0.9% to 27.1% higher success rates compared to their gradient descent-based counterparts.
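To make the core idea concrete, below is a minimal sketch of a PGD-style L∞ attack whose update direction is a weighted average over past gradients of the optimization history, as described in the abstract. This is not the authors' exact method: the function name, the fixed exponential-decay weighting, the history length, and the step-size parameters are illustrative assumptions, and the learned sampling-distribution hyperparameters from the paper are not reproduced here.

```python
import torch
import torch.nn.functional as F

def dsngd_attack_sketch(model, x, y, eps=8/255, alpha=2/255, steps=10, history=5):
    """Illustrative PGD-style attack using a nonlocal gradient estimate.

    The update direction is a weighted average over the most recent
    gradients (the general idea behind DSNGD). A fixed exponential decay
    is used as a stand-in for the paper's learned sampling distribution.
    """
    x_adv = x.clone().detach()
    grad_history = []  # gradients from previous iterations

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # Keep only the `history` most recent gradients.
        grad_history.append(grad.detach())
        grad_history = grad_history[-history:]

        # Weighted average over the optimization history
        # (newer gradients weighted more heavily; decay rate is a hypothetical choice).
        weights = [0.5 ** (len(grad_history) - 1 - i) for i in range(len(grad_history))]
        total = sum(weights)
        avg_grad = sum((w / total) * g for w, g in zip(weights, grad_history))

        # Standard L_inf step using the nonlocal gradient estimate,
        # projected back into the eps-ball around x and the valid pixel range.
        x_adv = x_adv.detach() + alpha * avg_grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)

    return x_adv
```

Averaging over the gradient history smooths out noise and small local optima of the loss surface, which is why the abstract argues this yields a better estimate of the global descent direction than the single pointwise gradient used in standard first-order attacks.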