Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation

Paper Title

Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation

Authors

Yunhan Zhao, Shu Kong, Daeyun Shin, Charless Fowlkes

Abstract

Leveraging synthetically rendered data offers great potential to improve monocular depth estimation and other geometric estimation tasks, but closing the synthetic-real domain gap is a non-trivial and important task. While much recent work has focused on unsupervised domain adaptation, we consider a more realistic scenario where a large amount of synthetic training data is supplemented by a small set of real images with ground-truth. In this setting, we find that existing domain translation approaches are difficult to train and offer little advantage over simple baselines that use a mix of real and synthetic data. A key failure mode is that real-world images contain novel objects and clutter not present in synthetic training. This high-level domain shift isn't handled by existing image translation models. Based on these observations, we develop an attention module that learns to identify and remove difficult out-of-domain regions in real images in order to improve depth prediction for a model trained primarily on synthetic data. We carry out extensive experiments to validate our attend-remove-complete approach (ARC) and find that it significantly outperforms state-of-the-art domain adaptation methods for depth prediction. Visualizing the removed regions provides interpretable insights into the synthetic-real domain gap.
