An Interpretable Probabilistic Approach for Demystifying Black-box Predictive Models

Paper Title

An Interpretable Probabilistic Approach for Demystifying Black-box Predictive Models

Authors

Moreira, Catarina; Chou, Yu-Liang; Velmurugan, Mythreyi; Ouyang, Chun; Sindhgatta, Renuka; Bruza, Peter

Abstract

The use of sophisticated machine learning models for critical decision making is faced with a challenge that these models are often applied as a "black-box". This has led to an increased interest in interpretable machine learning, where post hoc interpretation presents a useful mechanism for generating interpretations of complex learning models. In this paper, we propose a novel approach underpinned by an extended framework of Bayesian networks for generating post hoc interpretations of a black-box predictive model. The framework supports extracting a Bayesian network as an approximation of the black-box model for a specific prediction. Compared to the existing post hoc interpretation methods, the contribution of our approach is three-fold. Firstly, the extracted Bayesian network, as a probabilistic graphical model, can provide interpretations about not only what input features but also why these features contributed to a prediction. Secondly, for complex decision problems with many features, a Markov blanket can be generated from the extracted Bayesian network to provide interpretations with a focused view on those input features that directly contributed to a prediction. Thirdly, the extracted Bayesian network enables the identification of four different rules which can inform the decision-maker about the confidence level in a prediction, thus helping the decision-maker assess the reliability of predictions learned by a black-box model. We implemented the proposed approach, applied it in the context of two well-known public datasets and analysed the results, which are made available in an open-source repository.
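
The abstract mentions generating a Markov blanket from the extracted Bayesian network to focus on the features that directly contributed to a prediction. As a rough illustration of that one idea, and not the authors' implementation, the sketch below computes the Markov blanket of a node (its parents, its children, and its children's other parents) in a small directed acyclic graph. The function name `markov_blanket`, the edge list, and all variable names are hypothetical and chosen only for this example.

```python
def markov_blanket(edges, target):
    """Return the Markov blanket of `target` in a DAG.

    `edges` is an iterable of (parent, child) pairs. The blanket is the
    union of the target's parents, its children, and the other parents
    of those children.
    """
    parents = {p for p, c in edges if c == target}
    children = {c for p, c in edges if p == target}
    co_parents = {p for p, c in edges if c in children and p != target}
    return parents | children | co_parents


# Hypothetical network structure, invented for illustration only.
edges = [
    ("age", "education"),
    ("age", "income"),
    ("education", "income"),
    ("income", "prediction"),
    ("credit_history", "prediction"),
    ("prediction", "risk_flag"),
    ("region", "risk_flag"),
]

blanket = markov_blanket(edges, "prediction")
print(sorted(blanket))
```

In this toy graph, `age` and `education` influence `prediction` only through `income`, so they fall outside the blanket. That is the sense in which a Markov blanket gives a focused view of the directly relevant features.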
