Paper Title
Robust Out-of-distribution Detection for Neural Networks
Paper Authors
Paper Abstract
Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in the real world. Existing approaches for detecting OOD examples work well when evaluated on benign in-distribution and OOD samples. However, in this paper, we show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs with minimal adversarial perturbations that do not change their semantics. Formally, we extensively study the problem of Robust Out-of-Distribution Detection on common OOD detection approaches, and show that state-of-the-art OOD detectors can be easily fooled by adding small perturbations to the in-distribution and OOD inputs. To counteract these threats, we propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples. Our method can be flexibly combined with existing methods and renders them robust. On common benchmark datasets, we show that ALOE substantially improves the robustness of state-of-the-art OOD detection, with a 58.4% AUROC improvement on CIFAR-10 and a 46.59% improvement on CIFAR-100.
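
The abstract only names the two ingredients of ALOE (adversarially crafted inliers and outliers), so the following is a minimal PyTorch sketch of that idea, not the authors' released code. All names (pgd_perturb, kl_to_uniform, aloe_loss) and hyperparameter values are hypothetical, and the KL-to-uniform outlier objective is an assumption borrowed from the common Outlier Exposure formulation.

    import torch
    import torch.nn.functional as F

    def pgd_perturb(model, x, objective, eps=8/255, alpha=2/255, steps=5):
        # Projected gradient ascent: find the perturbation inside an
        # L-infinity ball of radius eps that maximizes the given objective.
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            objective(model(x + delta)).backward()
            with torch.no_grad():
                delta += alpha * delta.grad.sign()  # ascend the objective
                delta.clamp_(-eps, eps)             # project back into the ball
                delta.grad.zero_()
        return (x + delta).detach()

    def kl_to_uniform(logits):
        # KL(Uniform || softmax(logits)) up to an additive constant.
        # Minimizing this drives the model toward uniform (low-confidence)
        # predictions; maximizing it is the attacker's objective on outliers.
        return -F.log_softmax(logits, dim=1).mean()

    def aloe_loss(model, x_in, y_in, x_out, lam=0.5):
        # Inlier term: standard adversarial training (worst-case cross-entropy
        # on perturbed in-distribution examples).
        x_in_adv = pgd_perturb(model, x_in,
                               lambda logits: F.cross_entropy(logits, y_in))
        # Outlier term: perturb auxiliary outliers to look maximally
        # confident, then train the model to stay near-uniform on them.
        x_out_adv = pgd_perturb(model, x_out, kl_to_uniform)
        return (F.cross_entropy(model(x_in_adv), y_in)
                + lam * kl_to_uniform(model(x_out_adv)))

In a full training loop, the attack's backward passes also accumulate gradients in the model parameters, so one would call optimizer.zero_grad() after constructing the adversarial batches and before backpropagating aloe_loss; the outlier batch x_out would typically be drawn from a large auxiliary dataset, as in Outlier Exposure.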