Title
McXai: Local model-agnostic explanation as two games
Authors
Abstract
To this day, a variety of approaches for providing local interpretability of black-box machine learning models have been introduced. Unfortunately, all of these methods suffer from one or more of the following deficiencies: they are either difficult to understand themselves, they work on a per-feature basis and ignore the dependencies between features, and/or they focus only on those features that support the decision made by the model. To address these points, this work introduces a reinforcement learning-based approach called Monte Carlo tree search for eXplainable Artificial Intelligence (McXai) to explain the decisions of any black-box classification model (classifier). Our method leverages Monte Carlo tree search and models the process of generating explanations as two games. In one game, the reward is maximized by finding feature sets that support the decision of the classifier, while in the second game, the reward is maximized by finding feature sets that lead to alternative decisions. The result is a human-friendly representation in the form of a tree structure, in which each node represents a set of features to be studied, with smaller explanations at the top of the tree. Our experiments show that the features found by our method are more informative with respect to classifications than those found by classical approaches such as LIME and SHAP. Furthermore, by also identifying misleading features, our approach is able to guide improvements to the robustness of the black-box model in many situations.
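To make the two-game formulation concrete, below is a minimal, illustrative Python sketch. It assumes a scikit-learn-style classifier exposing predict_proba; the mean-imputation masking scheme, the two reward definitions, and the random-rollout search loop are hypothetical stand-ins chosen for brevity, not the authors' exact reward or the actual Monte Carlo tree search used by McXai (which grows a search tree over feature sets, with smaller explanations near the root).

```python
import numpy as np

def reward_supporting(clf, x, feature_set, baseline, target_class):
    """Game 1 (illustrative): keep only `feature_set`, mask all other
    features with `baseline`, and reward the classifier's remaining
    confidence in the original class. A high reward means the set
    alone supports the decision."""
    masked = baseline.copy()
    masked[feature_set] = x[feature_set]
    return clf.predict_proba(masked.reshape(1, -1))[0, target_class]

def reward_opposing(clf, x, feature_set, baseline, target_class):
    """Game 2 (illustrative): mask `feature_set` itself and reward the
    probability mass moved away from the original class. A high reward
    means removing the set weakens or flips the decision."""
    masked = x.copy()
    masked[feature_set] = baseline[feature_set]
    return 1.0 - clf.predict_proba(masked.reshape(1, -1))[0, target_class]

def random_rollout_search(clf, x, baseline, target_class, reward_fn,
                          n_rollouts=200, max_set_size=3, seed=None):
    """Stand-in for the MCTS loop: sample small feature sets and keep
    the best-scoring one under the given game's reward."""
    rng = np.random.default_rng(seed)
    n_features = x.shape[0]
    best_set, best_reward = None, -np.inf
    for _ in range(n_rollouts):
        size = rng.integers(1, max_set_size + 1)
        candidate = rng.choice(n_features, size=size, replace=False)
        r = reward_fn(clf, x, candidate, baseline, target_class)
        if r > best_reward:
            best_set, best_reward = candidate, r
    return best_set, best_reward

if __name__ == "__main__":
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    clf = RandomForestClassifier(random_state=0).fit(X, y)
    x = X[0]
    baseline = X.mean(axis=0)  # mean imputation as the mask value
    target = int(clf.predict(x.reshape(1, -1))[0])

    support, r1 = random_rollout_search(clf, x, baseline, target,
                                        reward_supporting, seed=0)
    oppose, r2 = random_rollout_search(clf, x, baseline, target,
                                       reward_opposing, seed=0)
    print("supporting features:", support, "reward:", round(r1, 3))
    print("opposing features:  ", oppose, "reward:", round(r2, 3))
```

The two reward functions are mirror images: game 1 scores what a feature set preserves of the original decision, game 2 scores what its removal destroys, which is how the approach can surface both supporting and misleading features from the same search machinery.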