Paper Title
Policy Optimization with Sparse Global Contrastive Explanations
Paper Authors
Paper Abstract
We develop a Reinforcement Learning (RL) framework for improving an existing behavior policy via sparse, user-interpretable changes. Our goal is to make minimal changes while gaining as much benefit as possible. We define a minimal change as having a sparse, global contrastive explanation between the original and proposed policy. We improve the current policy with the constraint of keeping that global contrastive explanation short. We demonstrate our framework with a discrete MDP and a continuous 2D navigation domain.
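To make the idea of a constrained, sparse policy change concrete, here is a minimal sketch for a small tabular MDP. It is an illustration only, not the paper's algorithm: it assumes that the length of the explanation can be approximated by the number of states in which the proposed policy differs from the original one, and it uses a simple greedy search that edits at most `k` states. The names `evaluate`, `sparse_improve`, and the three-state chain are all hypothetical.

```python
import numpy as np

def evaluate(P, R, policy, gamma, start):
    """Expected discounted return of a deterministic policy.

    P: (n_actions, n_states, n_states) transition probabilities
    R: (n_states, n_actions) rewards
    start: (n_states,) start-state distribution
    """
    n = R.shape[0]
    idx = np.arange(n)
    P_pi = P[policy, idx]   # row s is the next-state distribution under policy[s]
    r_pi = R[idx, policy]   # reward collected in each state under the policy
    # Solve the Bellman equation v = r_pi + gamma * P_pi @ v exactly.
    v = np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)
    return float(start @ v)

def sparse_improve(P, R, base_policy, gamma, start, k):
    """Greedily edit at most k states of base_policy (assumed sparsity proxy)."""
    policy = base_policy.copy()
    best = evaluate(P, R, policy, gamma, start)
    n_states, n_actions = R.shape
    for _ in range(k):
        best_change = None
        for s in range(n_states):
            if policy[s] != base_policy[s]:
                continue  # this state has already been edited
            for a in range(n_actions):
                if a == policy[s]:
                    continue
                candidate = policy.copy()
                candidate[s] = a
                value = evaluate(P, R, candidate, gamma, start)
                if value > best + 1e-12:
                    best, best_change = value, (s, a)
        if best_change is None:
            break  # no single edit improves the return
        policy[best_change[0]] = best_change[1]
    return policy, best

# Hypothetical 3-state chain: action 0 stays put, action 1 moves right.
# Only state 2 pays a reward; the original policy always stays.
P = np.zeros((2, 3, 3))
P[0] = np.eye(3)
P[1, 0, 1] = P[1, 1, 2] = P[1, 2, 2] = 1.0
R = np.zeros((3, 2))
R[2, :] = 1.0
base = np.zeros(3, dtype=int)
start = np.ones(3) / 3

new_policy, new_value = sparse_improve(P, R, base, 0.9, start, k=1)
print(new_policy, new_value)
```

With a budget of one edit, the sketch changes only the action in state 1, so the difference from the original policy can be stated in a single sentence. A greedy search of this kind can miss improvements that require several coordinated edits, which is one reason to treat it as a toy model rather than a faithful implementation.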