Scaling MAP-Elites to Deep Neuroevolution

Paper title

Scaling MAP-Elites to Deep Neuroevolution

Authors

Cédric Colas, Joost Huizinga, Vashisht Madhavan, Jeff Clune

Abstract

Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, have proven very useful for a broad range of applications including enabling real robots to recover quickly from joint damage, solving strongly deceptive maze tasks or evolving robot morphologies to discover new gaits. However, present implementations of MAP-Elites and other QD algorithms seem to be limited to low-dimensional controllers with far fewer parameters than modern deep neural network models. In this paper, we propose to leverage the efficiency of Evolution Strategies (ES) to scale MAP-Elites to high-dimensional controllers parameterized by large neural networks. We design and evaluate a new hybrid algorithm called MAP-Elites with Evolution Strategies (ME-ES) for post-damage recovery in a difficult high-dimensional control task where traditional ME fails. Additionally, we show that ME-ES performs efficient exploration, on par with state-of-the-art exploration algorithms in high-dimensional control tasks with strongly deceptive rewards.
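For readers unfamiliar with the MAP-Elites loop that the abstract builds on, the following is a minimal sketch of the generic algorithm on a toy problem. It is not the authors' ME-ES implementation: the toy `evaluate` function, the grid resolution, and the plain Gaussian mutation are all assumptions chosen for illustration. The ES-based update that the abstract describes is not shown here.

```python
import random

def evaluate(params):
    """Toy task (assumption): fitness is the negative squared norm, and the
    behavior descriptor is the first two parameters clipped to [0, 1]."""
    fitness = -sum(p * p for p in params)
    descriptor = tuple(min(max(p, 0.0), 1.0) for p in params[:2])
    return fitness, descriptor

def cell_of(descriptor, bins):
    # Map each descriptor dimension in [0, 1] to a grid index in [0, bins - 1].
    return tuple(min(int(d * bins), bins - 1) for d in descriptor)

def map_elites(n_iters=2000, dim=4, bins=5, sigma=0.1, seed=0):
    rng = random.Random(seed)
    archive = {}  # grid cell -> (fitness, params) of the best solution found there
    for _ in range(n_iters):
        if archive:
            # Pick a random elite and perturb it with Gaussian noise.
            _, parent = archive[rng.choice(sorted(archive))]
            child = [p + rng.gauss(0.0, sigma) for p in parent]
        else:
            # Bootstrap the archive with a random solution.
            child = [rng.uniform(0.0, 1.0) for _ in range(dim)]
        fitness, descriptor = evaluate(child)
        cell = cell_of(descriptor, bins)
        # Keep the child only if its cell is empty or it beats the current elite.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, child)
    return archive

if __name__ == "__main__":
    archive = map_elites()
    print(f"filled {len(archive)} of 25 cells")
```

The archive keeps one elite per behavior cell, so the result is a collection of diverse, locally best solutions instead of a single optimum. A damage-recovery procedure like the one mentioned in the abstract would draw on such a collection.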
