Paper Title
Assessing the Adversarial Robustness of Monte Carlo and Distillation Methods for Deep Bayesian Neural Network Classification
Paper Authors
Paper Abstract
In this paper, we consider the problem of assessing the adversarial robustness of deep neural network models under both Markov chain Monte Carlo (MCMC) and Bayesian Dark Knowledge (BDK) inference approximations. We characterize the robustness of each method to two types of adversarial attacks: the fast gradient sign method (FGSM) and projected gradient descent (PGD). We show that full MCMC-based inference has excellent robustness, significantly outperforming standard point estimation-based learning. On the other hand, BDK provides marginal improvements. As an additional contribution, we present a storage-efficient approach to computing adversarial examples for large Monte Carlo ensembles using both the FGSM and PGD attacks.
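To make the storage-efficient ensemble attack concrete, below is a minimal PyTorch sketch, not the authors' exact implementation. It attacks a Monte Carlo ensemble with FGSM by loading one posterior weight sample at a time from disk and accumulating input gradients, so memory never holds more than one set of weights. The names `make_model`, `checkpoint_paths`, and `epsilon` are hypothetical, and averaging per-sample loss gradients is one common way to approximate an attack on the ensemble prediction.

```python
# Sketch only: FGSM against a Monte Carlo ensemble, one posterior
# sample in memory at a time (assumed setup, not the paper's code).
import torch
import torch.nn.functional as F

def ensemble_fgsm(x, y, checkpoint_paths, make_model, epsilon):
    """Compute FGSM adversarial examples for an MC ensemble while
    keeping only one posterior sample's weights in memory."""
    grad_sum = torch.zeros_like(x)
    for path in checkpoint_paths:
        model = make_model()                      # rebuild architecture
        model.load_state_dict(torch.load(path))   # load one posterior sample
        model.eval()
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        grad_sum += x_adv.grad.detach()           # accumulate input gradient
        del model                                 # free weights before next sample
    # Single FGSM step on the sign of the accumulated (averaged) gradient;
    # clamp assumes inputs are normalized to [0, 1].
    return (x + epsilon * grad_sum.sign()).clamp(0.0, 1.0).detach()
```

A PGD variant would repeat this gradient accumulation for several small steps and project back onto the epsilon-ball after each step, at the cost of re-reading each checkpoint per iteration.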