How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs

Paper Title

How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs

Authors

Hazel Doughty, Cees G. M. Snoek

Abstract

We aim to understand how actions are performed and identify subtle differences, such as 'fold firmly' vs. 'fold gently'. To this end, we propose a method which recognizes adverbs across different actions. However, such fine-grained annotations are difficult to obtain and their long-tailed nature makes it challenging to recognize adverbs in rare action-adverb compositions. Our approach therefore uses semi-supervised learning with multiple adverb pseudo-labels to leverage videos with only action labels. Combined with adaptive thresholding of these pseudo-adverbs we are able to make efficient use of the available data while tackling the long-tailed distribution. Additionally, we gather adverb annotations for three existing video retrieval datasets, which allows us to introduce the new tasks of recognizing adverbs in unseen action-adverb compositions and unseen domains. Experiments demonstrate the effectiveness of our method, which outperforms prior work in recognizing adverbs and semi-supervised works adapted for adverb recognition. We also show how adverbs can relate fine-grained actions.
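The abstract gives no formulas, so the following is only a rough sketch of what assigning multiple adverb pseudo-labels with adaptive, per-adverb thresholds might look like. The function names, the frequency-scaled threshold rule, and the `base_tau` and `floor` parameters are all illustrative assumptions, not the authors' formulation.

```python
from typing import Dict, List


def adaptive_thresholds(
    label_counts: Dict[str, int],
    base_tau: float = 0.8,   # assumed threshold for the most frequent adverb
    floor: float = 0.3,      # assumed lower bound so rare adverbs are not accepted blindly
) -> Dict[str, float]:
    """Scale a base confidence threshold by how often each adverb is labeled.

    Assumption: rarer adverbs receive a lower threshold, so the long tail
    still collects pseudo-labels. The paper's actual rule may differ.
    """
    max_count = max(label_counts.values())
    return {
        adverb: max(floor, base_tau * count / max_count)
        for adverb, count in label_counts.items()
    }


def select_pseudo_adverbs(
    scores: Dict[str, float],
    thresholds: Dict[str, float],
) -> List[str]:
    """Keep every adverb whose predicted score clears its own threshold.

    More than one adverb can be kept, so a video that carries only an
    action label may receive multiple pseudo-adverbs.
    """
    return sorted(a for a, s in scores.items() if s >= thresholds[a])


# Hypothetical counts of labeled examples per adverb (long-tailed).
counts = {"quickly": 100, "firmly": 50, "gently": 25}
taus = adaptive_thresholds(counts)

# Hypothetical model scores for one video that has only an action label.
video_scores = {"quickly": 0.70, "firmly": 0.50, "gently": 0.35}
pseudo_labels = select_pseudo_adverbs(video_scores, taus)
```

With these made-up numbers, the frequent adverb "quickly" is rejected because 0.70 falls below its threshold of 0.8, while the rarer "firmly" and "gently" are both kept. This shows how per-class thresholds can shift pseudo-labels toward the tail of the distribution.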
