Paper Title

Algorithmic Frameworks for the Detection of High Density Anomalies

Author

Foorthuis, Ralph

Abstract

This study explores the concept of high-density anomalies. As opposed to the traditional concept of anomalies as isolated occurrences, high-density anomalies are deviant cases positioned in the most normal regions of the data space. Such anomalies are relevant for various practical use cases, such as misbehavior detection and data quality analysis. Effective methods for identifying them are particularly important when analyzing very large or noisy sets, for which traditional anomaly detection algorithms will return many false positives. In order to be able to identify high-density anomalies, this study introduces several non-parametric algorithmic frameworks for unsupervised detection. These frameworks are able to leverage existing underlying anomaly detection algorithms and offer different solutions for the balancing problem inherent in this detection task. The frameworks are evaluated with both synthetic and real-world datasets, and are compared with existing baseline algorithms for detecting traditional anomalies. The Iterative Partial Push (IPP) framework proves to yield the best detection results.
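
To make the notion of a high-density anomaly concrete, the following is a minimal illustrative sketch in Python. It is not the paper's IPP framework or any of its other frameworks, whose details the abstract does not give. It assumes a toy setting in which density is estimated from numerical coordinates with a k-nearest-neighbour heuristic, deviance is measured as the rarity of a categorical label, and the two are balanced by a simple product of min-max normalized scores. All function names, the data, and the scoring rule are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def knn_density(points, k=3):
    """Density proxy: inverse of the mean distance to the k nearest neighbours."""
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        densities.append(1.0 / (sum(dists[:k]) / k + 1e-9))
    return densities


def rarity(labels):
    """Deviance proxy: one minus the relative frequency of each case's label."""
    counts = Counter(labels)
    n = len(labels)
    return [1.0 - counts[label] / n for label in labels]


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def high_density_scores(points, labels, k=3):
    """Assumed balancing rule: a case scores high only if it is both dense and rare."""
    dens = min_max(knn_density(points, k))
    rare = min_max(rarity(labels))
    return [d * r for d, r in zip(dens, rare)]


# Toy data: a tight cluster labelled "a", one "b" case in its centre
# (the high-density anomaly), and one isolated "c" case (a traditional anomaly).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (0, 0.5), (0.5, 0), (1, 0.5), (0.5, 1), (10, 10)]
labels = ["a", "a", "a", "a", "b", "a", "a", "a", "a", "c"]

scores = high_density_scores(points, labels)
top = scores.index(max(scores))
print("highest-scoring case:", points[top], labels[top])
```

In this toy setup the isolated case is just as rare as the central one, but its normalized density is zero, so the product suppresses it. A traditional detector that rewards isolation would rank it first instead, which is the kind of false positive the abstract refers to.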
