Paper Title

Sim-to-Real Transfer with Incremental Environment Complexity for Reinforcement Learning of Depth-Based Robot Navigation

Authors

Thomas Chaffre, Julien Moras, Adrien Chan-Hon-Tong, Julien Marzat

Abstract

Transferring learning-based models to the real world remains one of the hardest problems in model-free control theory. Because of the cost of data collection on a real robot and the limited sample efficiency of Deep Reinforcement Learning algorithms, models are usually trained in a simulator, which in theory provides an unlimited amount of data. Yet even with unbounded trial-and-error runs, the reality gap between simulation and the physical world gives little guarantee about how the policy will behave in real operation. Depending on the problem, expensive real-world fine-tuning and/or a complex domain randomization strategy may be required to produce a relevant policy. In this paper, a Soft Actor-Critic (SAC) training strategy using incremental environment complexity is proposed to drastically reduce the need for additional training in the real world. The application addressed is depth-based mapless navigation, where a mobile robot must reach a given waypoint in a cluttered environment with no prior mapping information. Experimental results in simulated and real environments are presented to quantitatively assess the efficiency of the proposed approach, which achieves a success rate twice as high as that of a naive strategy.
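As a rough illustration of the incremental environment complexity idea (not the authors' implementation), the sketch below trains a single SAC policy over a curriculum of simulated navigation environments that grow progressively more cluttered before any real-world deployment. It assumes stable-baselines3 and Gymnasium are available; the environment IDs (`NavEmptyRoom-v0`, `NavSparseClutter-v0`, `NavDenseClutter-v0`) and timestep budgets are placeholders that would have to be registered and tuned for the actual depth-based navigation task.

```python
# Minimal sketch, assuming stable-baselines3 + Gymnasium and placeholder
# environment IDs. Illustrates curriculum-style SAC training in which the same
# agent is carried through simulated scenes of increasing complexity.
import gymnasium as gym
from stable_baselines3 import SAC

# Hypothetical stages, ordered from an empty room to a heavily cluttered scene.
CURRICULUM = [
    ("NavEmptyRoom-v0", 200_000),      # stage 1: waypoint reaching, no obstacles
    ("NavSparseClutter-v0", 300_000),  # stage 2: a few static obstacles
    ("NavDenseClutter-v0", 500_000),   # stage 3: dense clutter, near-real layout
]

# Initialize SAC on the simplest environment.
env_id, steps = CURRICULUM[0]
model = SAC("MlpPolicy", gym.make(env_id), verbose=1)
model.learn(total_timesteps=steps)

# Continue training the same policy and critics on harder environments,
# keeping the timestep counter so logging and schedules carry over.
for env_id, steps in CURRICULUM[1:]:
    model.set_env(gym.make(env_id))
    model.learn(total_timesteps=steps, reset_num_timesteps=False)

model.save("sac_depth_nav_curriculum")
```

Because the same networks (and, in stable-baselines3, the same replay buffer) persist across stages, experience gathered in the simpler environments bootstraps learning in the cluttered ones; whether `MlpPolicy` or a CNN-based policy is appropriate depends on how the depth observation and goal information are encoded.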
