Paper title
Learnable Optimal Sequential Grouping for Video Scene Detection
Paper authors
Paper abstract
Video scene detection is the task of dividing a video into temporal semantic chapters. It is an important preliminary step before attempting to analyze heterogeneous video content. Recently, Optimal Sequential Grouping (OSG) was proposed as a powerful unsupervised solution to a formulation of the video scene detection problem. In this work, we extend the capabilities of OSG to the learning regime. By combining the ability to learn from examples with a robust optimization formulation, we can boost performance and enhance the versatility of the technology. We present a comprehensive analysis of incorporating OSG into deep neural networks under various configurations. These configurations include learning an embedding in a straightforward manner, a tailored loss designed to guide the solution of OSG, and an integrated model in which learning is performed through the OSG pipeline. Through thorough evaluation and analysis, we assess the benefits and behavior of the various configurations, and show that our learnable OSG approach exhibits desirable behavior and enhanced performance compared to the state of the art.