Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark

Paper Title

Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark

Authors

Pei, Sen; Xu, Shixiong; Jin, Xiaojie

Abstract

Video highlights detection (VHD) is an active research field in computer vision that aims to locate the most user-appealing clips in raw video inputs. However, most VHD methods rely on a closed-world assumption: a fixed number of highlight categories is defined in advance, and all training data are available beforehand. Consequently, existing methods scale poorly as highlight domains and training data grow. To address these issues, we propose a novel video highlights detection method named Global Prototype Encoding (GPE), which learns incrementally and adapts to new domains via parameterized prototypes. To facilitate this new research direction, we collect a finely annotated dataset termed LiveFood, comprising over 5,100 live gourmet videos that cover four domains: ingredients, cooking, presentation, and eating. To the best of our knowledge, this is the first work to explore video highlights detection in the incremental learning setting, opening the way to applying VHD in practical scenarios where both the highlight domains of interest and the training data increase over time. We demonstrate the effectiveness of GPE through extensive experiments. Notably, GPE surpasses popular domain incremental learning methods on LiveFood, achieving significant mAP improvements on all domains. On the classic datasets, GPE also performs comparably to prior art. The code is available at: https://github.com/ForeverPs/IncrementalVHD_GPE.
