Global and Local Analysis of Interestingness for Competency-Aware Deep Reinforcement Learning

Paper Title

Global and Local Analysis of Interestingness for Competency-Aware Deep Reinforcement Learning

Authors

Pedro Sequeira, Jesse Hostetler, Melinda Gervasio

Abstract

In recent years, advances in deep learning have resulted in a plethora of successes in the use of reinforcement learning (RL) to solve complex sequential decision tasks with high-dimensional inputs. However, existing RL-based systems are essentially competency-unaware: they lack the interpretation mechanisms needed to give human operators an insightful, holistic view of their competence. This impedes their adoption, particularly in critical applications where the decisions an agent makes can have significant consequences. In this paper, we extend a recently proposed framework for explainable RL that is based on analyses of "interestingness." Our new framework provides various measures of RL agent competence stemming from interestingness analysis and is applicable to a wide range of RL algorithms. We also propose novel mechanisms for assessing RL agents' competencies that: 1) identify agent behavior patterns and competency-controlling conditions by clustering agent behavior traces solely using interestingness data; and 2) identify the task elements mostly responsible for an agent's behavior, as measured through interestingness, by performing global and local analyses using SHAP values. Overall, our tools provide insights about RL agent competence, both their capabilities and limitations, enabling users to make more informed decisions about interventions, additional training, and other interactions in collaborative human-machine settings.
