Paper title
How to find a unicorn: a novel model-free, unsupervised anomaly detection method for time series
Paper authors
Paper abstract
Recognizing anomalous events is a challenging but critical task in many scientific and industrial fields, especially when the properties of the anomalies are unknown. In this paper, we introduce a new anomaly concept called the "unicorn", or unique event, and present a new model-free, unsupervised algorithm to detect unicorns. The key component of the new algorithm is the Temporal Outlier Factor (TOF), which measures the uniqueness of events in continuous data sets from dynamic systems. The concept of a unique event differs significantly from that of a traditional outlier: repetitive outliers are not unique events, and a unique event is not necessarily an outlier, since it need not fall outside the distribution of normal activity. We examined the performance of our algorithm in recognizing unique events on different types of simulated data sets with anomalies and compared it with the Local Outlier Factor (LOF) and discord discovery algorithms. TOF outperformed the LOF and discord algorithms even in recognizing traditional outliers, and it also recognized unique events that those algorithms missed. The benefits of the unicorn concept and the new detection method are illustrated by example data sets from very different scientific fields. Our algorithm successfully recognized unique events in cases where they were already known, such as the gravitational waves of a binary black hole merger in LIGO detector data and the signs of respiratory failure in ECG data series. Furthermore, unique events were found in the LIBOR data set of the last 30 years.
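The abstract does not define TOF, so the following is only an illustrative sketch of one way a "temporal uniqueness" score could be computed. It assumes (this is not stated above) that the series is time-delay embedded and that each point is scored by the root-mean-square temporal distance to its k nearest neighbours in the embedded space. The function names, the parameters `dim`, `tau` and `k`, and the toy signal are all hypothetical choices made for illustration. Under this assumption, a low score marks a state whose only close neighbours are its immediate predecessors and successors in time, i.e. a state that never recurs.

```python
import numpy as np


def delay_embed(x, dim, tau):
    """Time-delay embedding: row i is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(dim)])


def temporal_outlier_score(x, dim=3, tau=12, k=4):
    """RMS temporal distance to the k nearest state-space neighbours.

    Assumed scoring rule for illustration only; low values suggest a
    state that does not recur elsewhere in the series.
    """
    X = delay_embed(np.asarray(x, dtype=float), dim, tau)
    # Brute-force pairwise Euclidean distances (fine for ~1000 points).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    nn = np.argsort(dist, axis=1)[:, :k]  # indices of the k nearest neighbours
    t = np.arange(len(X))[:, None]
    return np.sqrt(((nn - t) ** 2).mean(axis=1))


# Toy data: a periodic signal with a single, non-repeating bump near t = 510.
t = np.arange(1000)
signal = np.sin(2 * np.pi * t / 50)
signal = signal + 3.0 * np.exp(-((t - 510) / 5.0) ** 2)

scores = temporal_outlier_score(signal)
most_unique = int(np.argmin(scores))
print("lowest score at embedded index", most_unique)
```

On this toy signal, ordinary points find their nearest neighbours one or more periods away, so their scores are large, while points on the bump only have temporally adjacent neighbours and score low. Note that embedded indices are shifted relative to the raw series by up to `(dim - 1) * tau` samples.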