Paper Title

Semantics through Time: Semi-supervised Segmentation of Aerial Videos with Iterative Label Propagation

Paper Authors

Alina Marcu, Vlad Licaret, Dragos Costea, Marius Leordeanu

Paper Abstract

Semantic segmentation is a crucial task for robot navigation and safety. However, current supervised methods require a large amount of pixelwise annotations to yield accurate results. Labeling is a tedious and time consuming process that has hampered progress in low altitude UAV applications. This paper makes an important step towards automatic annotation by introducing SegProp, a novel iterative flow-based method, with a direct connection to spectral clustering in space and time, to propagate the semantic labels to frames that lack human annotations. The labels are further used in semi-supervised learning scenarios. Motivated by the lack of a large video aerial dataset, we also introduce Ruralscapes, a new dataset with high resolution (4K) images and manually-annotated dense labels every 50 frames - the largest of its kind, to the best of our knowledge. Our novel SegProp automatically annotates the remaining unlabeled 98% of frames with an accuracy exceeding 90% (F-measure), significantly outperforming other state-of-the-art label propagation methods. Moreover, when integrating other methods as modules inside SegProp's iterative label propagation loop, we achieve a significant boost over the baseline labels. Finally, we test SegProp in a full semi-supervised setting: we train several state-of-the-art deep neural networks on the SegProp-automatically-labeled training frames and test them on completely novel videos. We convincingly demonstrate, every time, a significant improvement over the supervised scenario.
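The core idea, propagating the sparse manual labels through optical flow in both temporal directions and combining per-pixel class votes, can be illustrated with a short sketch. The snippet below is a hypothetical single-pass illustration in the spirit of SegProp, not the authors' implementation: it assumes OpenCV's Farneback optical flow and NumPy, grayscale frames, integer label maps on the two annotated keyframes, and a num_classes argument introduced here for the example. SegProp itself iterates this vote-and-refine step and connects it to spectral clustering in space and time.

    # Hypothetical sketch of flow-based label propagation (not the authors' code).
    import cv2
    import numpy as np

    def warp_label_votes(labels_src, frame_src, frame_dst, num_classes):
        """Warp per-class label votes from a labeled source frame onto a target frame.
        Flow is estimated destination -> source, so each target pixel knows where
        to sample its label from (backward warping)."""
        flow = cv2.calcOpticalFlowFarneback(frame_dst, frame_src, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = frame_dst.shape[:2]
        grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                                     np.arange(h, dtype=np.float32))
        map_x = grid_x + flow[..., 0]
        map_y = grid_y + flow[..., 1]
        votes = np.zeros((h, w, num_classes), dtype=np.float32)
        for c in range(num_classes):
            class_mask = (labels_src == c).astype(np.float32)
            votes[..., c] = cv2.remap(class_mask, map_x, map_y, cv2.INTER_LINEAR)
        return votes

    def propagate_between_keyframes(frames, labels_first, labels_last, num_classes):
        """Pseudo-label every frame between two annotated keyframes by summing
        forward and backward flow votes and taking the per-pixel argmax.
        A single direct-flow pass is shown; in practice flow would be chained
        frame to frame over the interval and the propagation iterated."""
        pseudo_labels = []
        for t in range(1, len(frames) - 1):
            votes = (warp_label_votes(labels_first, frames[0], frames[t], num_classes) +
                     warp_label_votes(labels_last, frames[-1], frames[t], num_classes))
            pseudo_labels.append(np.argmax(votes, axis=-1).astype(np.uint8))
        return pseudo_labels

Such automatically generated pseudo-labels are what the semi-supervised experiments in the paper train on before evaluating on fully novel videos.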
