Paper Title


Bringing Online Egocentric Action Recognition into the wild

Authors

Gabriele Goletto, Mirco Planamente, Barbara Caputo, Giuseppe Averta

Abstract


To enable safe and effective human-robot cooperation, it is crucial to develop models for the identification of human activities. Egocentric vision seems to be a viable solution to this problem, and therefore many works provide deep learning solutions to infer human actions from first-person videos. However, although very promising, most of these do not consider the major challenges that come with a realistic deployment, such as the portability of the model, the need for real-time inference, and the robustness to novel domains (i.e., new spaces, users, tasks). With this paper, we set the boundaries that egocentric vision models should consider for realistic applications, defining a novel setting of egocentric action recognition in the wild, which encourages researchers to develop novel, application-aware solutions. We also present a new model-agnostic technique that enables the rapid repurposing of existing architectures in this new context, demonstrating the feasibility of deploying a model on a tiny device (Jetson Nano) and performing the task directly on the edge with very low energy consumption (2.4 W on average at 50 fps). The code is publicly available at: https://github.com/EgocentricVision/EgoWild.
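The abstract emphasizes real-time, on-device inference (50 fps sustained on a Jetson Nano). As a minimal, hypothetical sketch of how one might check whether a model meets such a budget, the following harness times per-frame inference over a simulated online stream; the `measure_fps` name and the callable `infer` are illustrative assumptions, not part of the paper's released code:

```python
import time

def measure_fps(infer, frames, warmup=5):
    """Estimate sustained throughput (frames per second) for an online
    model that must process a video stream one frame at a time, as in
    on-the-edge egocentric action recognition.

    infer  -- a callable running one forward pass on a single frame
    frames -- an iterable of frames (the first `warmup` are untimed)
    """
    frames = list(frames)
    # Warm-up passes: exclude one-off costs (allocation, JIT, caches).
    for f in frames[:warmup]:
        infer(f)
    start = time.perf_counter()
    for f in frames[warmup:]:
        infer(f)
    elapsed = time.perf_counter() - start
    return (len(frames) - warmup) / elapsed
```

A throughput above the camera's capture rate (e.g., 50 fps here) indicates the model can keep up with the live stream rather than falling behind it.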
