Paper Title

Learning Multi-Arm Manipulation Through Collaborative Teleoperation

Authors

Albert Tung, Josiah Wong, Ajay Mandlekar, Roberto Martín-Martín, Yuke Zhu, Li Fei-Fei, Silvio Savarese

Abstract

Imitation Learning (IL) is a powerful paradigm for teaching robots to perform manipulation tasks by allowing them to learn from human demonstrations collected via teleoperation, but it has mostly been limited to single-arm manipulation. However, many real-world tasks require multiple arms, such as lifting a heavy object or assembling a desk. Unfortunately, applying IL to multi-arm manipulation tasks has been challenging: asking a human to control more than one robotic arm can impose a significant cognitive burden and is often only possible for a maximum of two robot arms. To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks. Using MART, we collected demonstrations for five novel two- and three-arm tasks from several geographically separated users. From our data we arrived at a critical insight: most multi-arm tasks do not require global coordination throughout their full duration, but only during specific moments. We show that learning from such data consequently presents challenges for centralized agents that directly attempt to model all robot actions simultaneously, and we perform a comprehensive study of different policy architectures with varying levels of centralization on our tasks. Finally, we propose and evaluate a base-residual policy framework that allows trained policies to better adapt to the mixed coordination setting common in multi-arm manipulation, and show that a centralized policy augmented with a decentralized residual model outperforms all other models on our set of benchmark tasks. Additional results and videos are available at https://roboturk.stanford.edu/multiarm.
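To make the base-residual idea more concrete, the following is a minimal sketch of how a centralized base policy might be combined with decentralized per-arm residuals at action time. It is not the authors' implementation. The function names (`make_linear_policy`, `base_residual_action`), the toy linear "policies" that stand in for trained networks, and the clipping bound `residual_limit` are all assumptions introduced here for illustration; the abstract does not specify how the residual is parameterized or bounded.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]
Policy = Callable[[Sequence[float]], Vector]


def make_linear_policy(in_dim: int, out_dim: int, scale: float, seed: int) -> Policy:
    """Toy random linear map standing in for a trained network (hypothetical)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def policy(obs: Sequence[float]) -> Vector:
        return [sum(w * x for w, x in zip(row, obs)) for row in weights]

    return policy


def clip(value: float, limit: float) -> float:
    """Clamp value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def base_residual_action(
    per_arm_obs: Sequence[Sequence[float]],
    base_policy: Policy,
    residual_policies: Sequence[Policy],
    act_dim: int,
    residual_limit: float = 0.05,  # assumed bound, not taken from the paper
) -> List[Vector]:
    """Combine one centralized base action with per-arm residual corrections."""
    # Centralized base: sees every arm's observation and outputs all actions.
    joint_obs = [x for obs in per_arm_obs for x in obs]
    joint_action = base_policy(joint_obs)

    actions = []
    for i, (obs, residual) in enumerate(zip(per_arm_obs, residual_policies)):
        base_a = joint_action[i * act_dim:(i + 1) * act_dim]
        # Decentralized residual: sees only its own arm's observation and
        # contributes a small, clipped correction to that arm's base action.
        delta = residual(obs)
        actions.append([a + clip(d, residual_limit) for a, d in zip(base_a, delta)])
    return actions


# Usage: two arms, 3-D observations and 2-D actions per arm.
obs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
base = make_linear_policy(in_dim=6, out_dim=4, scale=1.0, seed=0)
residuals = [make_linear_policy(in_dim=3, out_dim=2, scale=1.0, seed=i + 1)
             for i in range(2)]
arm_actions = base_residual_action(obs, base, residuals, act_dim=2)
```

In this sketch the base policy handles the moments that need joint coordination, while each residual can only nudge its own arm's action by at most `residual_limit` per dimension. That is one plausible way to let arms act independently without discarding the centralized behavior.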
