Paper Title


MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis

Authors

Maximilian Gilles, Yuhao Chen, Tim Robin Winter, E. Zhixuan Zeng, Alexander Wong

Abstract


Autonomous bin picking poses significant challenges to vision-driven robotic systems given the complexity of the problem, ranging from various sensor modalities, to highly entangled object layouts, to diverse item properties and gripper types. Existing methods often address the problem from one perspective. Diverse items and complex bin scenes require diverse picking strategies together with advanced reasoning. As such, building robust and effective machine learning algorithms to solve this complex task requires significant amounts of comprehensive, high-quality data. Collecting such data in the real world would be too expensive and time-prohibitive, and therefore intractable from a scalability perspective. To tackle this big, diverse data problem, we take inspiration from the recent rise of the metaverse concept and introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis. The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order, and ambidextrous grasp labels for parallel-jaw and vacuum grippers. We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 levels of difficulty and an unseen object set, to evaluate different object and layout properties. Finally, we conduct extensive experiments showing that our proposed vacuum seal model and synthetic dataset achieve state-of-the-art performance and generalize to real-world use cases.
