Weakly Supervised Domain Adaptation for Built-up Region Segmentation in Aerial and Satellite Imagery

Paper Title

Weakly Supervised Domain Adaptation for Built-up Region Segmentation in Aerial and Satellite Imagery

Authors

Javed Iqbal, Mohsen Ali

Abstract

This paper proposes a novel domain adaptation algorithm to handle the challenges posed by satellite and aerial imagery, and demonstrates its effectiveness on the built-up region segmentation problem. Built-up area estimation is an important component in understanding the human impact on the environment, the effect of public policy, and general urban population analysis. The diverse nature of aerial and satellite imagery, together with the lack of labeled data covering this diversity, makes it difficult for machine learning algorithms to generalize on such tasks, especially across multiple domains. Moreover, because aerial and satellite imagery lacks the strong spatial context and structure found in ground imagery, applying existing unsupervised domain adaptation methods results in sub-optimal adaptation. We thoroughly study the limitations of existing domain adaptation methods and propose a weakly supervised adaptation strategy in which we assume image-level labels are available for the target domain. More specifically, we design a built-up area segmentation network (an encoder-decoder) with an added image classification head that guides the adaptation. The devised system is able to address the problem of visual differences across multiple satellite and aerial imagery datasets, ranging from high resolution (HR) to very high resolution (VHR). A realistic and challenging HR dataset is created by hand-tagging 73.4 sq-km of Rwanda, capturing a variety of built-up structures over different terrain. The developed dataset is spatially rich compared to existing datasets and covers diverse built-up scenarios, including built-up areas in forests and deserts, mud houses, and tin and colored rooftops. Extensive experiments are performed by adapting from a single source domain to segment the target domain. We achieve high gains in IoU, ranging from 11.6% to 52%, over the existing state-of-the-art methods.
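The abstract reports results in IoU and describes supervision from two sources: pixel-level masks in the source domain and image-level labels in the target domain. The following is a minimal sketch of how such quantities could be computed. The abstract does not give the actual objective, so the names `iou`, `bce`, `weak_adaptation_loss`, and `weight`, and the simple weighted sum of two binary cross-entropy terms, are illustrative assumptions and not the authors' formulation.

```python
import math


def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def weak_adaptation_loss(src_pixel_probs, src_mask,
                         tgt_image_prob, tgt_image_label, weight=1.0):
    """Hypothetical combined objective (an assumption, not the paper's loss):
    mean pixel-wise BCE on a labeled source image, plus a weighted
    image-level BCE on a target image that only has an image-level label."""
    flat_probs = [p for row in src_pixel_probs for p in row]
    flat_labels = [t for row in src_mask for t in row]
    seg = sum(bce(p, t) for p, t in zip(flat_probs, flat_labels)) / len(flat_probs)
    cls = bce(tgt_image_prob, tgt_image_label)
    return seg + weight * cls


# Usage: a 2x2 prediction compared with a 2x2 ground-truth mask.
score = iou([[1, 1], [0, 0]], [[1, 0], [0, 0]])
loss = weak_adaptation_loss([[0.5, 0.5], [0.5, 0.5]], [[1, 0], [0, 0]], 0.5, 1)
```

In this sketch the target image contributes only through its image-level term, which mirrors the weak-supervision assumption stated in the abstract: no pixel-level target masks are required.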
