Paper Title

CDT: Cascading Decision Trees for Explainable Reinforcement Learning

Paper Authors

Zihan Ding, Pablo Hernandez-Leal, Gavin Weiguang Ding, Changjian Li, Ruitong Huang

Abstract

Deep Reinforcement Learning (DRL) has recently achieved significant advances in various domains. However, explaining the policies of RL agents remains an open problem due to several factors, one being the complexity of explaining neural network decisions. Recently, a line of work has used decision-tree-based models to learn explainable policies. Soft decision trees (SDTs) and discretized differentiable decision trees (DDTs) have been demonstrated to achieve good performance while sharing the benefit of explainable policies. In this work, we further improve the results for tree-based explainable RL in both performance and explainability. Our proposal, Cascading Decision Trees (CDTs), applies representation learning on the decision path to allow richer expressivity. Empirical results show that, whether CDTs are used as policy function approximators or as imitation learners to explain black-box policies, they achieve better performance with more succinct and explainable models than SDTs. As a second contribution, our study reveals limitations of explaining black-box policies via imitation learning with tree-based explainable models, due to its inherent instability.
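For readers unfamiliar with the soft decision trees (SDTs) that CDTs are compared against, a minimal sketch of an SDT forward pass may help. This is not code from the paper: the class, parameter names, and random initialization are illustrative assumptions. The key idea (due to the SDT literature) is that each inner node routes an input left or right with a sigmoid gate, and the policy output is the path-probability-weighted mixture of per-leaf action distributions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SoftDecisionTree:
    """Illustrative depth-d soft decision tree policy (not the paper's code).

    Inner nodes are stored in heap order (root = 0, children of i are
    2i+1 and 2i+2); each holds a linear gate sigmoid(w.x + b) giving the
    probability of routing right. Each leaf holds action logits.
    """

    def __init__(self, depth, obs_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.depth = depth
        n_inner = 2 ** depth - 1
        n_leaves = 2 ** depth
        self.W = rng.normal(size=(n_inner, obs_dim))
        self.b = rng.normal(size=n_inner)
        self.leaf_logits = rng.normal(size=(n_leaves, n_actions))

    def action_probs(self, x):
        """Return the mixture action distribution for observation x."""
        n_leaves = 2 ** self.depth
        path_prob = np.ones(n_leaves)
        for leaf in range(n_leaves):
            node = 0
            for level in range(self.depth):
                p_right = sigmoid(self.W[node] @ x + self.b[node])
                # bit of `leaf` at this level decides left (0) / right (1)
                go_right = (leaf >> (self.depth - 1 - level)) & 1
                path_prob[leaf] *= p_right if go_right else (1.0 - p_right)
                node = 2 * node + 1 + go_right
        # softmax each leaf's logits into an action distribution
        leaf_dists = np.exp(self.leaf_logits - self.leaf_logits.max(axis=1, keepdims=True))
        leaf_dists /= leaf_dists.sum(axis=1, keepdims=True)
        # mixture weighted by path probabilities (which sum to 1)
        return path_prob @ leaf_dists
```

Because the sigmoid gates at every node split probability mass into parts summing to one, the leaf path probabilities form a valid distribution, so the returned action probabilities are non-negative and sum to one. A CDT, by contrast, additionally learns intermediate feature representations along the decision path, which the paper argues yields more succinct trees.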
