Autonomous sPOMDP Environment Modeling With Partial Model Exploitation

Paper Title

Autonomous sPOMDP Environment Modeling With Partial Model Exploitation

Authors

Andrew Wilhelm, Aaron Wilhelm, Garrett Fosdick

Abstract

A state space representation of an environment is a classic and yet powerful tool used by many autonomous robotic systems for efficient and often optimal solution planning. However, designing these representations with high performance is laborious and costly, necessitating an effective and versatile tool for autonomous generation of state spaces for autonomous robots. We present a novel state space exploration algorithm by extending the original surprise-based partially-observable Markov Decision Processes (sPOMDP), and demonstrate its effective long-term exploration planning performance in various environments. Through extensive simulation experiments, we show the proposed model significantly increases efficiency and scalability of the original sPOMDP learning techniques with a range of 31-63% gain in training speed while improving robustness in environments with less deterministic transitions. Our results pave the way for extending sPOMDP solutions to a broader set of environments.
