Paper Title

Explainability in Deep Reinforcement Learning

Authors

Alexandre Heuillet, Fabien Couthouis, Natalia Díaz-Rodríguez

Abstract

A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature relevance techniques to explain a deep neural network (DNN) output or explaining models that ingest image source data. However, assessing how XAI techniques can help understand models beyond classification tasks, e.g. for reinforcement learning (RL), has not been extensively studied. We review recent works in the direction to attain Explainable Reinforcement Learning (XRL), a relatively new subfield of Explainable Artificial Intelligence, intended to be used in general public applications, with diverse audiences, requiring ethical, responsible and trustable algorithms. In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight on the inner workings of what is still considered a black box. We evaluate mainly studies directly linking explainability to RL, and split these into two categories according to the way the explanations are generated: transparent algorithms and post-hoc explainability. We also review the most prominent XAI works from the lenses of how they could potentially enlighten the further deployment of the latest advances in RL, in the demanding present and future of everyday problems.
