Paper Title
MOCCA: Multi-Layer One-Class ClassificAtion for Anomaly Detection
Paper Authors
Paper Abstract
Anomalies are ubiquitous in all scientific fields and can signal unexpected events caused by incomplete knowledge of the data distribution or by an unknown process that suddenly comes into play and distorts observations. Because such events are rare, scientists train deep learning models for the Anomaly Detection (AD) task on "normal" data only, i.e., non-anomalous samples, leaving the neural network to infer the distribution underlying the input data. In this context, we propose a novel framework, named Multi-layer One-Class ClassificAtion (MOCCA), to train and test deep learning models on the AD task. Specifically, we apply it to autoencoders. A key novelty of our work is the explicit optimization of intermediate representations for the AD task. Unlike commonly used approaches that treat a neural network as a single computational block, i.e., use only the output of its last layer, MOCCA explicitly leverages the multi-layer structure of deep architectures. Each layer's feature space is optimized for AD during training, while in the test phase the deep representations extracted from the trained layers are combined to detect anomalies. With MOCCA, we split the training process into two steps. First, the autoencoder is trained on the reconstruction task only. Then, we retain only the encoder and task it with minimizing, at each considered layer, the L_2 distance between the output representation and a reference point: the centroid of the anomaly-free training data. Subsequently, we combine the deep features extracted at the various trained layers of the encoder to detect anomalies at inference time. To assess the performance of models trained with MOCCA, we conduct extensive experiments on publicly available datasets. We show that our proposed method achieves performance comparable or superior to state-of-the-art approaches in the literature.
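The abstract describes the training objective and the scoring procedure only in words. As a minimal, hedged sketch of what the per-layer one-class term could look like, the following PyTorch code pulls the features of each considered encoder layer toward a fixed centroid computed on anomaly-free data and sums the per-layer distances into an anomaly score at inference. The toy encoder architecture, the choice of layers, and the plain sum used to combine per-layer distances are illustrative assumptions, not details taken from the paper.

    # Sketch only: not the authors' code. Illustrates a multi-layer one-class
    # objective of the kind described in the abstract, under assumed shapes
    # (28x28 single-channel inputs) and an assumed toy encoder.
    import torch
    import torch.nn as nn

    class ToyEncoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.block1 = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU())
            self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
            self.head = nn.Linear(32 * 7 * 7, 64)  # assumes 28x28 inputs

        def forward(self, x):
            z1 = self.block1(x)            # intermediate representation, layer 1
            z2 = self.block2(z1)           # intermediate representation, layer 2
            z3 = self.head(z2.flatten(1))  # final representation
            # Return flattened features from every considered layer.
            return [z1.flatten(1), z2.flatten(1), z3]

    @torch.no_grad()
    def compute_centroids(encoder, loader, device="cpu"):
        # Per-layer centroids of the anomaly-free training data (the reference points).
        sums, count = None, 0
        for x, _ in loader:
            feats = encoder(x.to(device))
            if sums is None:
                sums = [f.sum(0) for f in feats]
            else:
                sums = [s + f.sum(0) for s, f in zip(sums, feats)]
            count += x.size(0)
        return [s / count for s in sums]

    def one_class_loss(feats, centroids):
        # Sum over layers of the mean squared L2 distance to each layer's centroid.
        return sum(((f - c) ** 2).sum(dim=1).mean() for f, c in zip(feats, centroids))

    def anomaly_score(feats, centroids):
        # Inference-time score per sample: per-layer distances combined by a plain
        # sum (an assumed choice; any weighting scheme could be used instead).
        return sum(((f - c) ** 2).sum(dim=1) for f, c in zip(feats, centroids))

In a full pipeline following the abstract, this loss would be minimized only after the reconstruction pre-training step, with the decoder discarded and the centroids kept fixed while the encoder is fine-tuned.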