Paper Title
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Paper Authors
Paper Abstract
Achieving human-level dexterity is an important open problem in robotics. However, dexterous hand manipulation tasks, even at the level of an infant, are challenging to solve through reinforcement learning (RL). The difficulty lies in the high degrees of freedom and the cooperation required among heterogeneous agents (e.g., the joints of the fingers). In this study, we propose the Bimanual Dexterous Hands Benchmark (Bi-DexHands), a simulator that involves two dexterous hands with tens of bimanual manipulation tasks and thousands of target objects. Specifically, tasks in Bi-DexHands are designed to match different levels of human motor skills according to the cognitive science literature. We built Bi-DexHands in Isaac Gym; this enables highly efficient RL training, reaching 30,000+ FPS with only a single NVIDIA RTX 3090. We provide a comprehensive benchmark of popular RL algorithms under different settings, including Single-agent/Multi-agent RL, Offline RL, Multi-task RL, and Meta RL. Our results show that PPO-type on-policy algorithms can master simple manipulation tasks comparable to the motor skills of human infants up to 48 months old (e.g., catching a flying object, opening a bottle), while multi-agent RL can further help to master manipulations that require skilled bimanual cooperation (e.g., lifting a pot, stacking blocks). Despite success on each individual task, when it comes to acquiring multiple manipulation skills, existing RL algorithms fail in most of the multi-task and few-shot learning settings, which calls for more substantial development from the RL community. Our project is open sourced at https://github.com/PKU-MARL/DexterousHands.
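To make the benchmark's described usage concrete, below is a minimal rollout sketch written against a generic Gym-style environment interface. This is an illustrative assumption, not the exact Bi-DexHands API: how a specific bimanual task environment is constructed is omitted here, and the real environments are GPU-vectorized via Isaac Gym, so observations, rewards, and done flags may be batched tensors rather than scalars. The authoritative interface and training scripts live in the repository linked above.

```python
import numpy as np

def random_rollout(env, num_steps=1000):
    """Sanity-check a bimanual manipulation task with uniformly random actions.

    `env` is assumed to expose a standard Gym-style interface
    (reset/step/action_space); this is a sketch, not the official
    Bi-DexHands entry point.
    """
    obs = env.reset()
    total_reward = 0.0
    for _ in range(num_steps):
        # Two dexterous hands yield a high-dimensional continuous action
        # space (finger joints plus wrist/base degrees of freedom per hand).
        action = env.action_space.sample()
        obs, reward, done, info = env.step(action)
        # Average in case reward/done are batched across parallel envs.
        total_reward += float(np.mean(reward))
        if np.any(done):
            obs = env.reset()
    return total_reward
```

A loop like this is typically the first check before plugging an environment into PPO or a multi-agent learner, since it confirms the action and observation shapes and gives a rough sense of the simulator throughput on a single GPU.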