Paper Title
Reinforcement Learning with Augmented Data
Paper Authors
Paper Abstract
Learning from visual observations is a fundamental yet challenging problem in Reinforcement Learning (RL). Although algorithmic advances combined with convolutional neural networks have proved to be a recipe for success, current methods are still lacking on two fronts: (a) data-efficiency of learning and (b) generalization to new environments. To this end, we present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms. We perform the first extensive study of general data augmentations for RL on both pixel-based and state-based inputs, and introduce two new data augmentations - random translate and random amplitude scale. We show that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods across common benchmarks. RAD sets a new state-of-the-art in terms of data-efficiency and final performance on the DeepMind Control Suite benchmark for pixel-based control as well as OpenAI Gym benchmark for state-based control. We further demonstrate that RAD significantly improves test-time generalization over existing methods on several OpenAI ProcGen benchmarks. Our RAD module and training code are available at https://www.github.com/MishaLaskin/rad.
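The two augmentations introduced above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); the canvas size and the scaling range `[alpha, beta]` below are illustrative assumptions. Random translate pads each image into a larger canvas at a random offset, and random amplitude scale multiplies each state vector by a uniformly sampled factor.

```python
import numpy as np

def random_translate(imgs, size):
    """Place each C x H x W image at a random offset inside a size x size
    zero-padded canvas (a sketch of RAD's random translate augmentation)."""
    n, c, h, w = imgs.shape
    out = np.zeros((n, c, size, size), dtype=imgs.dtype)
    for i in range(n):
        y = np.random.randint(0, size - h + 1)
        x = np.random.randint(0, size - w + 1)
        out[i, :, y:y + h, x:x + w] = imgs[i]
    return out

def random_amplitude_scale(states, alpha=0.6, beta=1.2):
    """Multiply each state vector by a random scale drawn uniformly from
    [alpha, beta]; the default range here is an illustrative assumption."""
    scale = np.random.uniform(alpha, beta, size=(states.shape[0], 1))
    return states * scale

# Example usage on a dummy batch of 84x84 pixel observations
# and a batch of state vectors:
pixels = np.random.rand(8, 3, 84, 84).astype(np.float32)
translated = random_translate(pixels, size=100)   # shape (8, 3, 100, 100)
states = np.random.rand(8, 17).astype(np.float32)
scaled = random_amplitude_scale(states)           # shape (8, 17)
```

Because both transforms preserve the batch dimension and act independently per sample, they can be applied to sampled replay-buffer batches before the policy update, which is what makes RAD a plug-and-play addition to most RL algorithms.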