FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

Paper Title

FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

Authors

Jinglin Xu, Yongming Rao, Xumin Yu, Guangyi Chen, Jie Zhou, Jiwen Lu

Abstract

Most existing action quality assessment methods rely on the deep features of an entire video to predict the score, which is less reliable due to the non-transparent inference process and poor interpretability. We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable. Towards this goal, we construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures. We also propose a procedure-aware approach for action quality assessment, learned by a new Temporal Segmentation Attention module. Specifically, we propose to parse pairwise query and exemplar action instances into consecutive steps with diverse semantic and temporal correspondences. The procedure-aware cross-attention is proposed to learn embeddings between query and exemplar steps to discover their semantic, spatial, and temporal correspondences, and further serve for fine-grained contrastive regression to derive a reliable scoring mechanism. Extensive experiments demonstrate that our approach achieves substantial improvements over state-of-the-art methods with better interpretability. The dataset and code are available at https://github.com/xujinglin/FineDiving.
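
To make the abstract's pipeline easier to picture, here is a minimal illustrative sketch, not the authors' Temporal Segmentation Attention module or their released code. It assumes the query and exemplar videos have already been parsed into per-step embedding vectors, uses plain scaled dot-product attention to stand in for the procedure-aware cross-attention, and uses a hypothetical untrained linear head (`head_weights`) to stand in for the contrastive regressor. Averaging the per-step differences is likewise an assumption made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query_steps, exemplar_steps):
    """Each query step embedding attends over all exemplar step embeddings.

    Assumption: scaled dot-product attention without learned projections.
    """
    dim = len(query_steps[0])
    attended = []
    for q in query_steps:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in exemplar_steps
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, exemplar_steps))
             for i in range(dim)]
        )
    return attended


def contrastive_score(query_steps, exemplar_steps, exemplar_score, head_weights):
    """Predict the query score relative to an exemplar with a known score.

    Assumption: a hypothetical linear head maps each step-wise difference
    (query step minus attended exemplar step) to a score offset, and the
    offsets are averaged over steps.
    """
    attended = cross_attention(query_steps, exemplar_steps)
    deltas = [
        sum(w * (qi - ai) for w, qi, ai in zip(head_weights, q, a))
        for q, a in zip(query_steps, attended)
    ]
    return exemplar_score + sum(deltas) / len(deltas)


if __name__ == "__main__":
    # Toy embeddings: three steps of dimension two for each video.
    query = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    exemplar = [[0.9, 0.1], [0.4, 0.6], [0.1, 0.9]]
    print(contrastive_score(query, exemplar, 80.0, [0.5, 0.5]))
```

The design point this illustrates is the relative formulation: the model predicts how far the query departs from an exemplar whose score is known, step by step, rather than regressing an absolute score from whole-video features.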
