Observation Space Matters: Benchmark and Optimization Algorithm

Paper Title

Observation Space Matters: Benchmark and Optimization Algorithm

Authors

Kim, Joanne Taery; Ha, Sehoon

Abstract

Recent advances in deep reinforcement learning (deep RL) enable researchers to solve challenging control problems, from simulated environments to real-world robotic tasks. However, deep RL algorithms are known to be sensitive to the problem formulation, including observation spaces, action spaces, and reward functions. There exist numerous choices for observation spaces but they are often designed solely based on prior knowledge due to the lack of established principles. In this work, we conduct benchmark experiments to verify common design choices for observation spaces, such as Cartesian transformation, binary contact flags, a short history, or global positions. Then we propose a search algorithm to find the optimal observation spaces, which examines various candidate observation spaces and removes unnecessary observation channels with a Dropout-Permutation test. We demonstrate that our algorithm significantly improves learning speed compared to manually designed observation spaces. We also analyze the proposed algorithm by evaluating different hyperparameters.
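
The abstract does not describe how the Dropout-Permutation test works internally. As a rough illustration of the general idea of checking whether an observation channel is needed, the sketch below uses plain permutation importance: shuffle one channel across samples and measure how much a score degrades. This is a stand-in technique, not the authors' procedure, and every name in it (`permutation_channel_scores`, `prune_channels`, `make_action_agreement_score`, the toy policy, the threshold) is a hypothetical chosen for this example.

```python
import random


def make_action_agreement_score(reference_actions):
    """Build a score: negative mean squared error against reference actions."""
    def score(policy, observations):
        errors = [(policy(o) - a) ** 2 for o, a in zip(observations, reference_actions)]
        return -sum(errors) / len(errors)
    return score


def permutation_channel_scores(policy, observations, score_fn, n_repeats=10, seed=0):
    """Return, per channel, the mean score drop when that channel is shuffled."""
    rng = random.Random(seed)
    baseline = score_fn(policy, observations)
    n_channels = len(observations[0])
    drops = []
    for c in range(n_channels):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle channel c across samples; leave all other channels intact.
            column = [obs[c] for obs in observations]
            rng.shuffle(column)
            permuted = [
                list(obs[:c]) + [v] + list(obs[c + 1:])
                for obs, v in zip(observations, column)
            ]
            total += baseline - score_fn(policy, permuted)
        drops.append(total / n_repeats)
    return drops


def prune_channels(drops, threshold):
    """Keep only channels whose shuffling hurts the score by more than threshold."""
    return [c for c, d in enumerate(drops) if d > threshold]


# Toy setup (assumption): a linear "policy" that ignores channel 2 entirely.
data_rng = random.Random(1)
observations = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
policy = lambda obs: 2.0 * obs[0] - obs[1]

score_fn = make_action_agreement_score([policy(o) for o in observations])
drops = permutation_channel_scores(policy, observations, score_fn)
kept = prune_channels(drops, threshold=1e-6)
print("score drop per channel:", drops)
print("channels kept:", kept)
```

In this toy case the score is agreement with the policy's own original actions, which keeps the example self-contained. In an actual RL setting the score would more plausibly be episodic return from rollouts, and the threshold would need tuning.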
