Paper title
Video Anomaly Detection by Estimating Likelihood of Representations
Authors
Abstract
Video anomaly detection is a challenging task, not only because it involves solving many sub-tasks such as motion representation, object localization and action recognition, but also because it is commonly treated as an unsupervised learning problem that involves detecting outliers. Traditionally, solutions to this task have focused on the mapping between video frames and their low-dimensional features, while ignoring the spatial connections of those features. Recent solutions analyze these spatial connections either by using hard clustering techniques, such as K-Means, or by applying neural networks that map latent features to a general understanding, such as action attributes. To detect video anomalies in the latent feature space, we propose a deep probabilistic model that recasts the task as a density estimation problem, in which latent manifolds are generated by a deep denoising autoencoder and clustered by expectation maximization. Evaluations on several benchmark datasets show the strengths of our model, which achieves outstanding performance on challenging datasets.
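The density-estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the denoising autoencoder is omitted, the latent features are assumed to be one-dimensional scalars drawn from synthetic data, and the names `fit_gmm_em` and `anomaly_score` are hypothetical. The sketch fits a Gaussian mixture to "normal" latent values by expectation maximization, then scores a new value by its negative log-likelihood, so that low-density points receive high anomaly scores.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, iters=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with k components by expectation maximization."""
    n = len(data)
    ordered = sorted(data)
    # Initialise means at evenly spaced quantiles, with a shared variance
    # and uniform mixing weights.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((x - centre) ** 2 for x in data) / n, min_var)
    variances = [spread] * k
    weights = [1.0 / k] * k

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([p / total for p in joint])

        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)

    return weights, means, variances


def anomaly_score(x, model):
    """Negative log-likelihood of x under the fitted mixture (higher = more anomalous)."""
    weights, means, variances = model
    density = sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return -math.log(max(density, 1e-300))


# Synthetic stand-in for latent features of normal frames (assumption:
# two well-separated clusters of normal behaviour).
rng = random.Random(0)
normal_latents = [rng.gauss(-2.0, 0.5) for _ in range(200)]
normal_latents += [rng.gauss(3.0, 0.7) for _ in range(200)]

model = fit_gmm_em(normal_latents)
```

Calling `anomaly_score(10.0, model)` on a value far from both clusters should give a much larger score than `anomaly_score(-2.0, model)`; in practice a threshold on this score would be chosen on held-out data.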