Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning

Paper Title

Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning

Authors

Guo, Jun; Chen, Yonghong; Hao, Yihang; Yin, Zixin; Yu, Yin; Li, Simin

Abstract

Although deep neural networks (DNNs) have strengthened the performance of cooperative multi-agent reinforcement learning (c-MARL), agent policies can be easily perturbed by adversarial examples. Given the safety-critical applications of c-MARL, such as traffic management, power management, and unmanned aerial vehicle control, it is crucial to test the robustness of a c-MARL algorithm before it is deployed in the real world. Existing adversarial attacks for MARL could be used for testing, but each is limited to a single robustness aspect (e.g., reward, state, or action), whereas a c-MARL model can be attacked from any aspect. To overcome this challenge, we propose MARLSafe, the first robustness testing framework for c-MARL algorithms. First, motivated by the Markov Decision Process (MDP), MARLSafe considers the robustness of c-MARL algorithms comprehensively from three aspects: state robustness, action robustness, and reward robustness. Any c-MARL algorithm must satisfy all of these robustness aspects simultaneously to be considered secure. Second, because c-MARL attacks are scarce, we propose c-MARL attacks as robustness testing algorithms from multiple aspects. Experiments on the SMAC environment reveal that many state-of-the-art c-MARL algorithms have low robustness in all aspects, pointing to the urgent need to test and enhance the robustness of c-MARL algorithms.
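The three aspects named in the abstract can be illustrated with a deliberately tiny toy problem. The sketch below is not MARLSafe and does not reproduce the paper's attacks or the SMAC environment; it is a minimal illustration, under assumptions made up for this example, of what perturbing the state, the action, or the reward of a cooperative task looks like. The one-step game, the helper names (`team_reward`, `train_shared_policy`, `evaluate`), and the worst-case perturbations are all hypothetical.

```python
import random

def team_reward(state, actions):
    """Toy cooperative reward: 1.0 only if every agent's action equals the state bit."""
    return 1.0 if all(a == state for a in actions) else 0.0

def train_shared_policy(reward_fn=lambda r: r):
    """Fit a shared value table by enumerating every (state, joint action) pair.

    reward_fn is a hypothetical hook for a training-time reward perturbation.
    Returns a greedy policy mapping observation -> action.
    """
    q = {s: {0: 0.0, 1: 0.0} for s in (0, 1)}
    for s in (0, 1):
        for a1 in (0, 1):
            for a2 in (0, 1):
                r = reward_fn(team_reward(s, (a1, a2)))
                q[s][a1] += r  # credit agent 1's action
                q[s][a2] += r  # credit agent 2's action
    return {s: max(q[s], key=q[s].get) for s in (0, 1)}

def evaluate(policy, episodes=1000, n_agents=2,
             state_attack=None, action_attack=None, seed=0):
    """Average team reward over one-step episodes, with optional test-time attacks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        state = rng.randint(0, 1)
        actions = []
        for i in range(n_agents):
            # State aspect: the attack changes what agent i observes.
            obs = state_attack(state, i) if state_attack else state
            actions.append(policy[obs])
        if action_attack:
            # Action aspect: the attack changes what is actually executed.
            actions = action_attack(actions)
        total += team_reward(state, actions)
    return total / episodes

clean_policy = train_shared_policy()
# Reward aspect: the learner is trained on sign-flipped rewards.
poisoned_policy = train_shared_policy(reward_fn=lambda r: -r)

report = {
    "clean": evaluate(clean_policy),
    "state": evaluate(clean_policy,
                      state_attack=lambda s, i: 1 - s if i == 0 else s),
    "action": evaluate(clean_policy,
                       action_attack=lambda acts: [1 - acts[0]] + acts[1:]),
    "reward": evaluate(poisoned_policy),
}
print(report)
```

In this toy setting, a single compromised agent is enough to drive the team reward from 1.0 to 0.0 under each perturbation. That mirrors the abstract's point that a policy has to hold up on all three aspects at once, although real c-MARL attacks are bounded and learned rather than hand-written worst cases like these.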
