Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL

Paper Title

Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL

Authors

Siyi Hu, Chuanlong Xie, Xiaodan Liang, Xiaojun Chang

Abstract

Cooperative multi-agent reinforcement learning (MARL) is making rapid progress on tasks in grid-world and real-world scenarios, in which agents are given different attributes and goals, resulting in different behavior throughout the multi-agent task. In this study, we quantify the differences in agents' behavior and relate them to policy performance via Role Diversity, a metric that measures the characteristics of MARL tasks. We define role diversity from three perspectives (action-based, trajectory-based, and contribution-based) to fully measure a multi-agent task. Through theoretical analysis, we find that the error bound in MARL can be decomposed into three parts that are strongly related to role diversity. The decomposed factors can significantly impact policy optimization in three popular directions: parameter sharing, communication mechanisms, and credit assignment. The main experimental platforms are based on the Multiagent Particle Environment (MPE) and the StarCraft Multi-Agent Challenge (SMAC). Extensive experiments clearly show that role diversity can serve as a robust measurement of the characteristics of a multi-agent cooperation task and can help diagnose whether a policy fits the current multi-agent system, leading to better policy performance.
