Paper Title

Distance-Aware Occlusion Detection with Focused Attention

Authors

Yang Li, Yucheng Tu, Xiaoxue Chen, Hao Zhao, Guyue Zhou

Abstract

For humans, understanding the relationships between objects using visual signals is intuitive. For artificial intelligence, however, this task remains challenging. Researchers have made significant progress in semantic relationship detection, such as human-object interaction detection and visual relationship detection. We take the study of visual relationships a step further, from semantic to geometric. In particular, we predict relative occlusion and relative distance relationships. However, detecting these relationships from a single image is challenging. Enforcing focused attention on task-specific regions plays a critical role in successfully detecting these relationships. In this work, (1) we propose a novel three-decoder architecture as the infrastructure for focused attention; (2) we use the generalized intersection box prediction task to effectively guide our model to focus on occlusion-specific regions; (3) our model achieves new state-of-the-art performance on distance-aware relationship detection. Specifically, our model increases the distance F1-score from 33.8% to 38.6% and boosts the occlusion F1-score from 34.4% to 41.2%. Our code is publicly available.
