Paper Title

ACuTE: Automatic Curriculum Transfer from Simple to Complex Environments

Paper Authors

Yash Shukla, Christopher Thierauf, Ramtin Hosseini, Gyan Tatiya, Jivko Sinapov

Paper Abstract

Despite recent advances in Reinforcement Learning (RL), many problems, especially real-world tasks, remain prohibitively expensive to learn. To address this issue, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum to learn a problem that may otherwise be too difficult to learn from scratch. However, generating and optimizing a curriculum in a realistic scenario still requires extensive interactions with the environment. To address this challenge, we formulate the curriculum transfer problem, in which the schema of a curriculum optimized in a simpler, easy-to-solve environment (e.g., a grid world) is transferred to a complex, realistic scenario (e.g., a physics-based robotics simulation or the real world). We present "ACuTE", Automatic Curriculum Transfer from Simple to Complex Environments, a novel framework to solve this problem, and evaluate our proposed method by comparing it to other baseline approaches (e.g., domain adaptation) designed to speed up learning. We observe that our approach produces improved jumpstart and time-to-threshold performance even when adding task elements that further increase the difficulty of the realistic scenario. Finally, we demonstrate that our approach is independent of the learning algorithm used for curriculum generation, and is Sim2Real transferable to a real-world scenario using a physical robot.
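
The abstract reports results in terms of jumpstart and time-to-threshold, two metrics commonly used in the transfer learning literature for RL. As a rough illustration only, the sketch below computes both from per-episode return curves. The function names, the choice of episodes as the time unit, and the example numbers are all assumptions made for this sketch; they are not taken from the paper, which may define the metrics differently (for example, over environment interactions, or with the cost of the curriculum itself included).

```python
from typing import Optional, Sequence


def jumpstart(transfer: Sequence[float], baseline: Sequence[float], k: int = 1) -> float:
    """Difference in mean return over the first k episodes (transfer minus baseline)."""
    if k < 1 or k > min(len(transfer), len(baseline)):
        raise ValueError("k must be between 1 and the length of the shorter curve")
    return sum(transfer[:k]) / k - sum(baseline[:k]) / k


def time_to_threshold(curve: Sequence[float], threshold: float) -> Optional[int]:
    """Index of the first episode whose return reaches the threshold, or None if never."""
    for episode, value in enumerate(curve):
        if value >= threshold:
            return episode
    return None


# Made-up learning curves for illustration; these are NOT results from the paper.
baseline_curve = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8]   # hypothetical agent trained from scratch
transfer_curve = [0.3, 0.5, 0.7, 0.8, 0.9, 0.9]   # hypothetical agent using a transferred curriculum

initial_gain = jumpstart(transfer_curve, baseline_curve, k=1)
baseline_time = time_to_threshold(baseline_curve, threshold=0.8)
transfer_time = time_to_threshold(transfer_curve, threshold=0.8)
```

Under these assumptions, a positive jumpstart together with a smaller time-to-threshold for the transfer curve is the kind of improvement the abstract describes.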
