Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning

Paper Title

Long Short-Term Memory for Spatial Encoding in Multi-Agent Path Planning

Authors

Schlichting, Marc R., Notter, Stefan, Fichter, Walter

Abstract

Reinforcement learning-based path planning for multi-agent systems of varying size constitutes a research topic with increasing significance as progress in domains such as urban air mobility and autonomous aerial vehicles continues. Reinforcement learning with continuous state and action spaces is used to train a policy network that accommodates desirable path planning behaviors and can be used for time-critical applications. A Long Short-Term Memory module is proposed to encode an unspecified number of states for a varying, indefinite number of agents. The described training strategies and policy architecture lead to a guidance that scales to an infinite number of agents and unlimited physical dimensions, although training takes place at a smaller scale. The guidance is implemented on a low-cost, off-the-shelf onboard computer. The feasibility of the proposed approach is validated by presenting flight test results of up to four drones, autonomously navigating collision-free in a real-world environment.
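The abstract's central idea, folding the states of a variable number of agents into one fixed-size encoding with an LSTM, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TinyLSTMEncoder`, the 4-element state layout, the hidden size of 8, and the random untrained weights are all assumptions made for illustration, and the order in which neighbor states are fed in is left to the caller.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMEncoder:
    """Illustrative (hypothetical) LSTM that folds a variable-length list of
    neighbor states into one fixed-size vector. Weights are random, not trained."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # Rows are stacked per gate: input, forget, cell candidate, output.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                  for _ in range(4 * hidden_size)]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, h, c):
        """One standard LSTM cell update for a single neighbor state x."""
        z = list(x) + h
        pre = [sum(w * v for w, v in zip(row, z)) + b
               for row, b in zip(self.W, self.b)]
        H = self.hidden_size
        i = [sigmoid(p) for p in pre[0:H]]
        f = [sigmoid(p) for p in pre[H:2 * H]]
        g = [math.tanh(p) for p in pre[2 * H:3 * H]]
        o = [sigmoid(p) for p in pre[3 * H:4 * H]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def encode(self, neighbor_states):
        """Feed the states one by one; the final hidden state is the encoding."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in neighbor_states:
            h, c = self.step(x, h, c)
        return h


# Assumed state layout per neighbor: (rel_x, rel_y, vel_x, vel_y).
encoder = TinyLSTMEncoder(input_size=4, hidden_size=8)
two_neighbors = encoder.encode([[1.0, 0.5, 0.0, -0.2], [-2.0, 1.5, 0.3, 0.0]])
five_neighbors = encoder.encode([[float(k), 0.1 * k, 0.0, 0.0] for k in range(5)])
print(len(two_neighbors), len(five_neighbors))
```

Because the encoding has the same length regardless of how many neighbors are supplied, a downstream policy network can keep a fixed input size while the number of agents varies, which is the property the abstract relies on when training at a small scale and deploying at a larger one.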
