Paper Title
Publishing Efficient On-device Models Increases Adversarial Vulnerability
Paper Authors
Paper Abstract
Recent increases in the computational demands of deep neural networks (DNNs) have sparked interest in efficient deep learning mechanisms, e.g., quantization or pruning. These mechanisms enable the construction of small, efficient versions of commercial-scale models with comparable accuracy, accelerating their deployment to resource-constrained devices. In this paper, we study the security considerations of publishing on-device variants of large-scale models. We first show that an adversary can exploit on-device models to make attacking the large models easier. In evaluations across 19 DNNs, by exploiting the published on-device models as a transfer prior, the adversarial vulnerability of the original commercial-scale models increases by up to 100x. We then show that the vulnerability increases as the similarity between a full-scale model and its efficient counterpart increases. Based on these insights, we propose a defense, $similarity$-$unpairing$, that fine-tunes on-device models with the objective of reducing this similarity. We evaluated our defense on all 19 DNNs and found that it reduces transferability by up to 90% and increases the number of queries an attacker requires by a factor of 10-100x. Our results suggest that further research is needed on the security (and even privacy) threats posed by publishing these efficient siblings.
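To make the attack setting concrete, below is a minimal sketch of a transfer attack in PyTorch: adversarial examples are crafted with standard L-infinity PGD against the published on-device model (used as the surrogate), then measured against the full-scale model. The names `ondevice_model`, `full_model`, `x`, and `y` are hypothetical stand-ins, and PGD is just one common attack choice; the paper's exact transfer-prior setup may differ.

```python
# A minimal sketch (not the paper's exact attack): craft adversarial
# examples with L-infinity PGD on the published on-device surrogate,
# then check how often they also fool the full-scale model.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD against a surrogate model (inputs assumed in [0, 1])."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

@torch.no_grad()
def transfer_rate(full_model, x_adv, y):
    """Fraction of surrogate-crafted examples that also fool the full model."""
    preds = full_model(x_adv).argmax(dim=1)
    return (preds != y).float().mean().item()

# Usage (models in eval mode, on the same device as x and y):
#   x_adv = pgd_attack(ondevice_model, x, y)
#   print(transfer_rate(full_model, x_adv, y))
```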
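The defense direction can be illustrated similarly. The sketch below is an illustrative guess at a similarity-unpairing style fine-tuning step, not the paper's exact objective: it keeps the on-device model accurate with a task loss while penalizing the cosine similarity between its input gradients and those of the frozen full-scale model. It assumes the on-device model is differentiable (e.g., pruned or fake-quantized), and `opt` is an optimizer over the on-device model's parameters.

```python
# An illustrative guess at a similarity-unpairing style fine-tuning step
# (not the paper's exact formulation): the on-device model stays accurate
# while its input gradients decorrelate from the full model's, which should
# weaken transfer attacks that use it as a prior.
import torch
import torch.nn.functional as F

def input_grad(model, x, y, create_graph=False):
    """Gradient of the loss w.r.t. the input, flattened per example."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x, create_graph=create_graph)
    return grad.flatten(1)

def unpairing_step(ondevice_model, full_model, x, y, opt, lam=1.0):
    """One step: task loss plus a gradient-similarity penalty."""
    opt.zero_grad()
    task_loss = F.cross_entropy(ondevice_model(x), y)
    # Keep the graph for the on-device gradient so the penalty can be
    # backpropagated into its weights (a second-order term).
    g_small = input_grad(ondevice_model, x, y, create_graph=True)
    g_full = input_grad(full_model, x, y).detach()  # frozen reference
    sim = F.cosine_similarity(g_small, g_full, dim=1).mean()
    (task_loss + lam * sim).backward()
    opt.step()
    return task_loss.item(), sim.item()
```

The weight `lam` trades accuracy against dissimilarity; the actual similarity measure and objective used by the paper should be taken from the paper itself.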