HandsOff: Labeled Dataset Generation With No Additional Human Annotations

Paper title

HandsOff: Labeled Dataset Generation With No Additional Human Annotations

Authors

Austin Xu, Mariya I. Vasileva, Achal Dave, Arjun Seshadri

Abstract

Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on less than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development which stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation. Project page: austinxu87.github.io/handsoff.

Paper title