Distributed Differential Dynamic Programming Architectures for Large-Scale Multi-Agent Control

Paper title

Distributed Differential Dynamic Programming Architectures for Large-Scale Multi-Agent Control

Authors

Saravanos, Augustinos D., Aoyama, Yuichiro, Zhu, Hongchang, Theodorou, Evangelos A.

Abstract

In this paper, we propose two novel decentralized optimization frameworks for multi-agent nonlinear optimal control problems in robotics. The aim of this work is to suggest architectures that inherit the computational efficiency and scalability of Differential Dynamic Programming (DDP) and the distributed nature of the Alternating Direction Method of Multipliers (ADMM). In this direction, two frameworks are introduced. The first one, called Nested Distributed DDP (ND-DDP), is a three-level architecture that employs ADMM for enforcing consensus among all agents, an augmented Lagrangian layer for satisfying local constraints, and DDP as each agent's optimizer. In the second approach, both consensus and local constraints are handled with ADMM, yielding a two-level architecture called Merged Distributed DDP (MD-DDP), which further reduces computational complexity. Both frameworks are fully decentralized, since all computations are parallelizable among the agents and only local communication is necessary. Simulation results that scale up to thousands of vehicles and hundreds of drones verify the effectiveness of the methods. Superior scalability to large-scale systems compared with centralized DDP and centralized/decentralized sequential quadratic programming is also illustrated. Finally, hardware experiments on a multi-robot platform demonstrate the applicability of the proposed algorithms, while highlighting the importance of optimizing for feedback policies to increase robustness against uncertainty. A video including all results is available at https://youtu.be/tluvENcWldw.
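The consensus layer mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' ND-DDP or MD-DDP algorithm. It is a minimal scalar global-consensus ADMM loop under several assumptions: each agent's local optimizer is a closed-form quadratic solve standing in for DDP, all agents share one consensus variable (the paper relies on local communication only), and there are no local constraints. The function name `consensus_admm` and the parameters `targets`, `weights`, `rho`, and `iters` are hypothetical.

```python
def consensus_admm(targets, weights, rho=1.0, iters=300):
    """Toy scalar consensus ADMM (illustrative assumption, not the paper's method).

    Agent i holds the local cost 0.5 * w_i * (x_i - a_i)^2 and all agents
    must agree on a common value z.
    """
    n = len(targets)
    x = [0.0] * n   # local copies, one per agent
    y = [0.0] * n   # dual variables, one per agent
    z = 0.0         # shared consensus variable
    for _ in range(iters):
        # Local step (parallelizable across agents): closed-form minimizer of
        # 0.5*w*(x - a)^2 + y*(x - z) + 0.5*rho*(x - z)^2.
        # In the paper this role is played by each agent's DDP solve.
        x = [(w * a - yi + rho * z) / (w + rho)
             for a, w, yi in zip(targets, weights, y)]
        # Consensus step: average the dual-shifted local copies.
        z = sum(xi + yi / rho for xi, yi in zip(x, y)) / n
        # Dual step: penalize remaining disagreement with the consensus value.
        y = [yi + rho * (xi - z) for xi, yi in zip(x, y)]
    return z, x


if __name__ == "__main__":
    z, x = consensus_admm([0.0, 3.0, 5.0], [1.0, 2.0, 4.0])
    print("consensus value:", z)
    print("local copies:", x)
```

For these quadratic costs the consensus value should approach the weighted mean of the targets, which makes the toy problem easy to check by hand.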
