Paper Title

MIXRTs: Toward Interpretable Multi-Agent Reinforcement Learning via Mixing Recurrent Soft Decision Trees

Authors

Zichuan Liu, Yuanyang Zhu, Zhi Wang, Yang Gao, Chunlin Chen

Abstract

While achieving tremendous success in various fields, existing multi-agent reinforcement learning (MARL) with a black-box neural network makes decisions in an opaque manner that hinders humans from understanding the learned knowledge and how input observations influence decisions. In contrast, existing interpretable approaches usually suffer from weak expressivity and low performance. To bridge this gap, we propose MIXing Recurrent soft decision Trees (MIXRTs), a novel interpretable architecture that can represent explicit decision processes via the root-to-leaf path and reflect each agent's contribution to the team. Specifically, we construct a novel soft decision tree using a recurrent structure and demonstrate which features influence the decision-making process. Then, based on the value decomposition framework, we linearly assign credit to each agent by explicitly mixing individual action values to estimate the joint action value using only local observations, providing new insights into interpreting the cooperation mechanism. Theoretical analysis confirms that MIXRTs guarantee additivity and monotonicity in the factorization of joint action values. Evaluations on complex tasks like Spread and StarCraft II demonstrate that MIXRTs compete with existing methods while providing clear explanations, paving the way for interpretable and high-performing MARL systems.
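The abstract's two key mechanisms can be sketched in code: a soft decision tree whose inner nodes route an observation through sigmoid gates (so the most probable root-to-leaf path explains a decision), and a monotonic linear mixer that combines per-agent values with non-negative weights (guaranteeing additivity and monotonicity of the joint value). This is a minimal illustrative sketch, not the authors' implementation: all class names, shapes, and the tree depth are assumptions, and the recurrent structure (feeding a hidden state back into the tree input) is omitted for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SoftDecisionTree:
    """Illustrative depth-`depth` soft tree: each inner node i computes a gate
    p_i = sigmoid(w_i . obs + b_i). A leaf's probability is the product of the
    gate terms along its root-to-leaf path, so the highest-probability path
    gives a readable explanation of which features drove the decision."""
    def __init__(self, obs_dim, n_actions, depth=2, seed=0):
        rng = np.random.default_rng(seed)
        self.depth = depth
        self.n_inner = 2 ** depth - 1            # routing (inner) nodes
        self.n_leaves = 2 ** depth               # leaf value heads
        self.W = rng.normal(0.0, 0.1, (self.n_inner, obs_dim))
        self.b = np.zeros(self.n_inner)
        self.leaf_q = rng.normal(0.0, 0.1, (self.n_leaves, n_actions))

    def forward(self, obs):
        gates = sigmoid(self.W @ obs + self.b)   # P(go right) at each node
        path_p = np.ones(self.n_leaves)
        for leaf in range(self.n_leaves):        # leaf index encodes the path
            node = 0
            for d in range(self.depth):
                go_right = (leaf >> (self.depth - 1 - d)) & 1
                path_p[leaf] *= gates[node] if go_right else 1.0 - gates[node]
                node = 2 * node + 1 + go_right   # child in the implicit heap
        q = path_p @ self.leaf_q                 # soft mixture of leaf Q-values
        return q, path_p

def monotonic_mix(agent_qs, raw_weights, bias=0.0):
    """Joint value as a linear combination with non-negative weights, so
    dQ_tot/dQ_i >= 0 (monotonicity) and the factorization stays additive."""
    w = np.abs(raw_weights)                      # enforce w_i >= 0
    return float(w @ agent_qs + bias)
```

A quick check of the invariants: the leaf path probabilities always sum to 1, and the mixer's non-negative weights mean that increasing any agent's individual value can never decrease the estimated joint value.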
