Paper Title
BTech Thesis Report on Adversarial Attack Detection and Purification of Adversarially Attacked Images
Paper Authors
Paper Abstract
This is a BTech thesis report on the detection and purification of adversarially attacked images. A deep learning model is trained on a set of training examples for tasks such as classification and regression; training adjusts the weights so that the model not only performs the task well on the training examples, as judged by a chosen metric, but also generalizes to unseen examples, typically called the test data. Despite the huge success of machine learning models on a wide range of tasks, their security has received far less attention over the years. Robustness against potential cyber attacks should also be a metric for evaluating machine learning models. Such attacks can cause serious harm in the sensitive real-world applications where machine learning is deployed, such as medical and transportation systems. Hence, it is a necessity to secure these systems against such attacks. In this report, I focus on a class of these cyber attacks called adversarial attacks, in which the original input sample is modified by small perturbations so that it still looks visually the same to human beings, yet machine learning models are fooled by such inputs. I discuss two novel ways to counter adversarial attacks using autoencoders: 1) detecting the presence of adversaries, and 2) purifying these adversaries so that the target classification model is robust against such attacks.
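To make the notion of a small, visually imperceptible perturbation concrete, the sketch below implements the well-known Fast Gradient Sign Method (FGSM) in PyTorch. This is a standard attack from the adversarial-examples literature, shown only for illustration; the report does not state which attacks it evaluates, and the model interface, epsilon value, and function name here are assumptions.

import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    # Illustrative FGSM sketch; epsilon and names are hypothetical.
    # Takes one signed-gradient step that increases the classification
    # loss, producing an image that looks unchanged to a human but can
    # fool the classifier.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    # Clamp back to the valid pixel range before returning.
    return x_adv.clamp(0.0, 1.0).detach()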
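Both defenses mentioned above, detection and purification, can be built around an autoencoder trained only on clean images. The following is a minimal sketch under that assumption, written for single-channel 28x28 inputs; the architecture, threshold, and function names are illustrative choices, not the exact models used in the thesis. Detection flags inputs whose reconstruction error exceeds a threshold calibrated on clean data, on the reasoning that an autoencoder trained on clean images reconstructs off-manifold (adversarial) inputs poorly; purification passes the possibly attacked input through the autoencoder and hands the reconstruction, which lies closer to the clean-data manifold, to the classifier instead.

import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    # Illustrative architecture for 1-channel 28x28 images (e.g. MNIST);
    # the thesis models may differ.
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1,
                               output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def detect_adversarial(ae, x, threshold):
    # Detection: flag inputs whose per-example reconstruction error
    # exceeds a threshold calibrated on clean validation data.
    with torch.no_grad():
        err = ((x - ae(x)) ** 2).flatten(1).mean(dim=1)
    return err > threshold

def purify(ae, x):
    # Purification: replace the input with its reconstruction, projecting
    # it back toward the clean-data manifold before classification.
    with torch.no_grad():
        return ae(x)

In use, a deployed pipeline would first call detect_adversarial to reject or log suspicious inputs, then classify purify(ae, x) rather than x, so the downstream classifier never sees the raw perturbed image.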