Paper Title

Integrating Prior Knowledge in Post-hoc Explanations

Paper Authors

Adulam Jeyasothy, Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Marcin Detyniecki

Paper Abstract

In the field of eXplainable Artificial Intelligence (XAI), post-hoc interpretability methods aim at explaining to a user the predictions of a trained decision model. Integrating prior knowledge into such interpretability methods aims at improving the explanation understandability and allowing for personalised explanations adapted to each user. In this paper, we propose to define a cost function that explicitly integrates prior knowledge into the interpretability objectives: we present a general framework for the optimization problem of post-hoc interpretability methods, and show that user knowledge can thus be integrated to any method by adding a compatibility term in the cost function. We instantiate the proposed formalization in the case of counterfactual explanations and propose a new interpretability method called Knowledge Integration in Counterfactual Explanation (KICE) to optimize it. The paper performs an experimental study on several benchmark data sets to characterize the counterfactual instances generated by KICE, as compared to reference methods.
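The framework the abstract describes can be illustrated with a minimal sketch: a counterfactual search whose cost function combines the usual interpretability objectives (proximity to the query instance, reaching the target class) with an added compatibility term penalising instances that violate the user's prior knowledge. This is not the paper's KICE implementation; the toy model, the grid search, and the specific knowledge constraint below are illustrative assumptions.

```python
# Sketch of a cost function with a knowledge-compatibility term,
# in the spirit of the framework described in the abstract.
# Model, data, and the knowledge constraint are toy assumptions.
import itertools

def model(x):
    # toy black-box classifier: class 1 iff x0 + x1 > 1
    return 1 if x[0] + x[1] > 1.0 else 0

def distance(x, cf):
    # proximity term: Euclidean distance to the query instance
    return sum((a - b) ** 2 for a, b in zip(x, cf)) ** 0.5

def incompatibility(cf):
    # hypothetical prior knowledge: the user only considers instances
    # with x1 <= 0.8 plausible; violations are penalised linearly
    return max(0.0, cf[1] - 0.8)

def counterfactual(x, target, lam=10.0, mu=5.0, step=0.05):
    # brute-force search minimising:
    #   distance + lam * (wrong class) + mu * incompatibility
    grid = [i * step for i in range(0, 41)]  # candidate values in [0, 2]
    best, best_cost = None, float("inf")
    for cand in itertools.product(grid, grid):
        cost = (distance(x, cand)
                + lam * (model(cand) != target)
                + mu * incompatibility(cand))
        if cost < best_cost:
            best, best_cost = cand, cost
    return best

x = (0.2, 0.3)                    # classified 0 by the toy model
cf = counterfactual(x, target=1)  # nearest knowledge-compatible counterfactual
print(cf, model(cf))
```

Dropping the `mu * incompatibility(cand)` term recovers a plain counterfactual search; the point of the framework is that any such method can be extended by adding the compatibility term to its cost function.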
