Paper Title
Probabilistic Outlier Detection and Generation
Paper Authors
Paper Abstract
A new method for outlier detection and generation is introduced that lifts data into the space of probability distributions. These distributions are not analytically expressible, but samples can be drawn from them using a neural generator. Given a mixture of unknown latent inlier and outlier distributions, a Wasserstein double autoencoder is used to both detect and generate inliers and outliers. The proposed method, named WALDO (Wasserstein Autoencoder for Learning the Distribution of Outliers), is evaluated for detection accuracy and robustness on classical data sets including MNIST, CIFAR10 and KDD99. We give an example of outlier detection on a real retail sales data set and an example of outlier generation for simulating intrusion attacks, and we foresee many other application scenarios where WALDO can be used. To the best of our knowledge, this is the first work that studies outlier detection and generation together.