Paper Title

Adaptable Automation with Modular Deep Reinforcement Learning and Policy Transfer

Authors

Zohreh Raziei, Mohsen Moghaddam

Abstract

Recent advances in deep Reinforcement Learning (RL) have created unprecedented opportunities for intelligent automation, where a machine can autonomously learn an optimal policy for performing a given task. However, current deep RL algorithms predominantly specialize in a narrow range of tasks, are sample inefficient, and lack sufficient stability, which in turn hinder their industrial adoption. This article tackles this limitation by developing and testing a Hyper-Actor Soft Actor-Critic (HASAC) RL framework based on the notions of task modularization and transfer learning. The goal of the proposed HASAC is to enhance the adaptability of an agent to new tasks by transferring the learned policies of former tasks to the new task via a "hyper-actor". The HASAC framework is tested on a new virtual robotic manipulation benchmark, Meta-World. Numerical experiments show superior performance by HASAC over state-of-the-art deep RL algorithms in terms of reward value, success rate, and task completion time.
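The abstract's central idea, transferring the learned policies of former tasks to a new task via a "hyper-actor", can be illustrated with a toy sketch. This is not the paper's actual HASAC implementation (which builds on deep Soft Actor-Critic networks); it is a minimal, assumption-laden analogy in which each task-specific actor is a one-parameter linear policy, and the hypothetical `HyperActor` class initializes a new task's actor by averaging the parameters learned on earlier tasks. All class and method names here are invented for illustration.

```python
class ToyActor:
    """A toy task-specific actor: a linear policy a = w * s.
    Stands in for a deep actor network from the paper."""
    def __init__(self, weight=0.0):
        self.weight = weight

    def act(self, state):
        return self.weight * state


class HyperActor:
    """Holds actors learned on former tasks and initializes a new
    task's actor from them (a crude stand-in for policy transfer)."""
    def __init__(self):
        self.actors = {}  # task name -> ToyActor

    def register(self, task, actor):
        self.actors[task] = actor

    def transfer_to(self, new_task):
        # Initialize the new task's actor from the mean of the
        # previously learned weights, so learning on the new task
        # starts from transferred knowledge rather than scratch.
        if not self.actors:
            actor = ToyActor()
        else:
            mean_w = sum(a.weight for a in self.actors.values()) / len(self.actors)
            actor = ToyActor(mean_w)
        self.actors[new_task] = actor
        return actor


# Task names below echo Meta-World-style manipulation tasks.
hyper = HyperActor()
hyper.register("reach", ToyActor(0.8))
hyper.register("push", ToyActor(0.4))
new_actor = hyper.transfer_to("pick-place")
print(new_actor.act(1.0))  # initial action of the transferred policy
```

In the actual framework, the transferred object is a learned deep policy rather than a scalar weight, and transfer is mediated by the hyper-actor during Soft Actor-Critic training; the sketch only conveys the modular "reuse former tasks to warm-start a new task" structure.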
