Paper title
DEGAN: Time Series Anomaly Detection using Generative Adversarial Network Discriminators and Density Estimation
Paper authors
Paper abstract
Developing efficient time series anomaly detection techniques is important for maintaining service quality and providing early alarms. Generative neural network methods are a class of unsupervised approaches that has attracted increasing attention in recent years. In this paper, we propose DEGAN, an unsupervised Generative Adversarial Network (GAN)-based anomaly detection framework. It relies solely on normal time series data as input to train a well-configured discriminator (D) into a standalone anomaly predictor. In this framework, time series data is processed with a sliding window. Expected normal patterns in the data are leveraged to develop a generator (G) capable of generating normal data patterns. Normal data is also used in the hyperparameter tuning and D model selection steps. Validated D models are then extracted and applied to evaluate unseen (testing) time series and identify patterns that have anomalous characteristics. Kernel density estimation (KDE) is applied to the data points that are likely to be anomalous to generate probability density functions over the testing time series. The segments with the highest relative probabilities are detected as anomalies. To evaluate performance, we tested the framework on univariate acceleration time series covering five miles of a Class I railroad track, with the goal of detecting the real anomalous observations identified by operators. The results show that using the framework with a CNN D architecture yields an average best recall and precision of 80% and 86%, respectively, which demonstrates that a well-trained standalone D model has the potential to be a reliable anomaly detector. Moreover, the influence of GAN hyperparameters, GAN architectures, sliding window sizes, clustering of time series, and model validation with labeled/unlabeled data was also investigated.
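The detection stage described in the abstract (sliding windows, a per-window discriminator score, then KDE over the suspicious points) can be illustrated with a rough sketch. This is not the authors' implementation. The function names, the window size, the threshold, and the bandwidth are all hypothetical, and `toy_score` is only a placeholder for a trained discriminator. The sketch also assumes that a low score means "less like normal data", which the abstract does not specify.

```python
import math


def sliding_windows(series, window, stride=1):
    """Split a 1-D sequence into overlapping fixed-length windows."""
    last_start = len(series) - window
    return [series[i:i + window] for i in range(0, last_start + 1, stride)]


def toy_score(window):
    """Placeholder for a trained discriminator D (assumption, not a GAN).

    Returns a value in (0, 1]; large-amplitude windows get a low score.
    """
    return 1.0 / (1.0 + max(abs(x) for x in window))


def gaussian_kde(points, bandwidth):
    """Return a function that evaluates a Gaussian KDE built on `points`."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return density


# Synthetic univariate series: flat signal with a short burst at 50..54.
series = [0.0] * 100
for i in range(50, 55):
    series[i] = 3.0

WINDOW = 5        # hypothetical sliding window size
THRESHOLD = 0.5   # hypothetical cut-off on the score
BANDWIDTH = 3.0   # hypothetical KDE bandwidth, in samples

windows = sliding_windows(series, WINDOW)
scores = [toy_score(w) for w in windows]

# Windows whose score falls below the threshold are treated as suspicious.
flagged = [i for i, s in enumerate(scores) if s < THRESHOLD]

# Represent each suspicious window by its centre position in the series.
centres = [i + WINDOW // 2 for i in flagged]

# Estimate a density over the series positions from the suspicious points
# and take the position with the highest density as the most likely anomaly.
density = gaussian_kde(centres, BANDWIDTH)
peak = max(range(len(series)), key=density)
print("suspicious windows:", flagged)
print("density peak at index:", peak)
```

In a real setting, `toy_score` would be replaced by the output of the validated D model, and the threshold and bandwidth would have to be chosen during model validation rather than fixed by hand.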