Paper Title


A Self-supervised Approach for Adversarial Robustness

Paper Authors

Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Fatih Porikli

Paper Abstract


Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems, e.g., for classification, segmentation and object detection. The vulnerability of DNNs against such attacks can prove a major roadblock towards their real-world deployment. Transferability of adversarial examples demands generalizable defenses that can provide cross-task protection. Adversarial training, which enhances robustness by modifying the target model's parameters, lacks such generalizability. On the other hand, input-processing-based defenses fall short in the face of continuously evolving attacks. In this paper, we take the first step to combine the benefits of both approaches and propose a self-supervised adversarial training mechanism in the input space. By design, our defense is a generalizable approach and provides significant robustness against \textbf{unseen} adversarial attacks (\eg by reducing the success rate of the translation-invariant \textbf{ensemble} attack from 82.6\% to 31.9\% in comparison to the previous state-of-the-art). It can be deployed as a plug-and-play solution to protect a variety of vision systems, as we demonstrate for the case of classification, segmentation and detection. Code is available at: {\small\url{https://github.com/Muzammal-Naseer/NRP}}.
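Since the defense operates purely in the input space, one way to picture the "plug-and-play" deployment described above is as an image-to-image purifier placed in front of an otherwise unmodified vision model. The sketch below is illustrative only: the `purifier` and `target_model` modules are assumptions standing in for the released NRP purifier network and any pretrained classifier, segmenter or detector; it is not the authors' exact API.

```python
import torch
import torch.nn as nn


class PurifiedModel(nn.Module):
    """Wrap an arbitrary vision model with an input-space purifier.

    The purifier is assumed to be an image-to-image network (trained once,
    in a self-supervised fashion) that removes adversarial perturbations
    before the protected model ever sees the input. The target model's
    parameters are never modified, which is what makes the defense
    task-agnostic and pluggable.
    """

    def __init__(self, purifier: nn.Module, target_model: nn.Module):
        super().__init__()
        self.purifier = purifier
        self.target_model = target_model

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_purified = self.purifier(x)       # project possibly adversarial input back toward clean data
        return self.target_model(x_purified)  # unchanged downstream model (classifier / segmenter / detector)


# Hypothetical usage: `purifier` and `resnet50` are placeholders for a
# pretrained purifier network and any off-the-shelf vision model.
# defended = PurifiedModel(purifier, resnet50).eval()
# logits = defended(adversarial_images)
```

Because the wrapper only touches the input tensor, the same purifier instance can sit in front of different downstream models without retraining any of them.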
