Paper Title

LOT: Layer-wise Orthogonal Training on Improving $\ell_2$ Certified Robustness

Paper Authors

Xiaojun Xu, Linyi Li, Bo Li

Paper Abstract

Recent studies show that training deep neural networks (DNNs) with Lipschitz constraints can enhance adversarial robustness and other model properties such as stability. In this paper, we propose a layer-wise orthogonal training method (LOT) to effectively train 1-Lipschitz convolution layers by parametrizing an orthogonal matrix with an unconstrained matrix. We then efficiently compute the inverse square root of a convolution kernel by transforming the input domain to the Fourier frequency domain. On the other hand, as existing works show that semi-supervised training helps improve empirical robustness, we aim to bridge the gap and prove that semi-supervised learning also improves the certified robustness of Lipschitz-bounded models. We conduct comprehensive evaluations of LOT under different settings. We show that LOT significantly outperforms baselines in deterministic $\ell_2$ certified robustness and scales to deeper neural networks. Under the supervised scenario, we improve the state-of-the-art certified robustness for all architectures (e.g., from 59.04% to 63.50% on CIFAR-10 and from 32.57% to 34.59% on CIFAR-100 at radius $\rho = 36/255$ for 40-layer networks). With semi-supervised learning over unlabelled data, we improve the state-of-the-art certified robustness on CIFAR-10 at $\rho = 108/255$ from 36.04% to 42.39%. In addition, LOT consistently outperforms baselines on different model architectures with only 1/3 of the evaluation time.
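To make the construction described in the abstract concrete, below is a minimal PyTorch sketch of the idea: transform an unconstrained kernel $V$ to the Fourier frequency domain and form $W_f = V_f (V_f^{H} V_f)^{-1/2}$ per frequency, which yields a 1-Lipschitz (orthogonal, circular) convolution. This is an illustrative sketch, not the authors' implementation: the function names, the zero-padding convention, and the use of an eigendecomposition for the inverse square root are assumptions made here for clarity (an efficient implementation would compute the inverse square root iteratively).

```python
import torch

def orthogonal_conv_kernel_fft(V: torch.Tensor, n: int) -> torch.Tensor:
    """Map an unconstrained kernel V of shape (c, c, k, k) to a per-frequency
    unitary kernel for n x n inputs: W_f = V_f (V_f^H V_f)^{-1/2}.
    Hypothetical sketch of the idea stated in the abstract."""
    # FFT of the kernel, zero-padded to the input size -> (c, c, n, n), complex
    V_f = torch.fft.fft2(V, s=(n, n))
    # Rearrange to one c x c matrix per frequency -> (n, n, c, c)
    V_f = V_f.permute(2, 3, 0, 1)
    # Hermitian Gram matrix per frequency
    gram = V_f.conj().transpose(-2, -1) @ V_f
    # Inverse square root via eigendecomposition (illustrative only; an
    # efficient implementation would use an iterative scheme instead)
    eigvals, eigvecs = torch.linalg.eigh(gram)
    inv_sqrt_diag = torch.diag_embed(eigvals.clamp_min(1e-7).rsqrt()).to(eigvecs.dtype)
    inv_sqrt = eigvecs @ inv_sqrt_diag @ eigvecs.conj().transpose(-2, -1)
    # Each per-frequency matrix is unitary, so the induced (circular)
    # convolution is orthogonal and hence 1-Lipschitz
    W_f = V_f @ inv_sqrt
    return W_f.permute(2, 3, 0, 1)  # back to (c, c, n, n)

def apply_orthogonal_conv(x: torch.Tensor, W_f: torch.Tensor) -> torch.Tensor:
    """Apply the frequency-domain kernel W_f (c, c, n, n) to inputs x (b, c, n, n)."""
    x_f = torch.fft.fft2(x)                          # (b, c, n, n)
    y_f = torch.einsum('ijnm,bjnm->binm', W_f, x_f)  # mix channels per frequency
    return torch.fft.ifft2(y_f).real
```

The eigendecomposition above is only for readability; the abstract's efficiency claim rests on computing the per-frequency inverse square root more cheaply during training.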
