Paper Title

Phone2Proc: Bringing Robust Robots Into Our Chaotic World

Paper Authors

Matt Deitke, Rose Hendrix, Luca Weihs, Ali Farhadi, Kiana Ehsani, Aniruddha Kembhavi

Paper Abstract

Training embodied agents in simulation has become mainstream for the embodied AI community. However, these agents often struggle when deployed in the physical world due to their inability to generalize to real-world environments. In this paper, we present Phone2Proc, a method that uses a 10-minute phone scan and conditional procedural generation to create a distribution of training scenes that are semantically similar to the target environment. The generated scenes are conditioned on the wall layout and arrangement of large objects from the scan, while also sampling lighting, clutter, surface textures, and instances of smaller objects with randomized placement and materials. Leveraging just a simple RGB camera, training with Phone2Proc shows massive improvements from 34.7% to 70.7% success rate in sim-to-real ObjectNav performance across a test suite of over 200 trials in diverse real-world environments, including homes, offices, and RoboTHOR. Furthermore, Phone2Proc's diverse distribution of generated scenes makes agents remarkably robust to changes in the real world, such as human movement, object rearrangement, lighting changes, or clutter.
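The core idea of the abstract can be illustrated with a small sketch: the wall layout and large-object placement recovered from the phone scan are held fixed across all generated scenes, while lighting, surface textures, clutter, and small-object instances are resampled for each scene. This is a hypothetical illustration of conditional procedural generation, not the authors' actual implementation; all names and parameter ranges below are assumptions.

```python
import random
from dataclasses import dataclass

@dataclass
class ScannedLayout:
    """Fixed structure recovered from the 10-minute phone scan (illustrative)."""
    walls: list          # wall layout, kept identical in every generated scene
    large_objects: list  # placements of sofas, beds, tables, ...

def generate_scene(layout, rng):
    """Sample one training scene conditioned on the scanned layout.

    The layout is copied through unchanged; everything else is randomized,
    mirroring the sampling of lighting, clutter, textures, and small-object
    instances described in the abstract.
    """
    return {
        "walls": layout.walls,                  # conditioned on the scan
        "large_objects": layout.large_objects,  # conditioned on the scan
        "lighting": rng.uniform(0.3, 1.0),      # randomized light intensity
        "wall_texture": rng.choice(["paint", "brick", "wallpaper"]),
        "clutter_count": rng.randint(0, 12),    # small distractor objects
        "small_objects": [
            {"type": rng.choice(["mug", "book", "plant"]),
             "material": rng.choice(["wood", "metal", "plastic"])}
            for _ in range(rng.randint(3, 8))
        ],
    }

rng = random.Random(0)
layout = ScannedLayout(walls=["w0", "w1", "w2", "w3"],
                       large_objects=["sofa", "table"])
scenes = [generate_scene(layout, rng) for _ in range(1000)]

# Every scene shares the scanned structure, but varies broadly elsewhere.
assert all(s["walls"] == layout.walls for s in scenes)
```

Training on such a distribution, rather than on a single reconstructed scene, is what the abstract credits for the agent's robustness to lighting changes, clutter, and object rearrangement at deployment time.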
