Optimal Local Explainer Aggregation for Interpretable Prediction

Paper Title

Optimal Local Explainer Aggregation for Interpretable Prediction

Authors

Qiaomei Li, Rachel Cummings, Yonatan Mintz

Abstract

A key challenge for decision makers when incorporating black-box machine-learned models into practice is being able to understand the predictions these models provide. One proposed set of methods trains surrogate explainer models that approximate the more complex model. Explainer methods are generally classified as either local or global, depending on what portion of the data space they are purported to explain. The improved coverage of global explainers usually comes at the expense of explainer fidelity. One way of trading off the advantages of both approaches is to aggregate several local explainers into a single explainer model with improved coverage. However, the problem of aggregating these local explainers is computationally challenging, and existing methods only use heuristics to form these aggregations. In this paper we propose a local explainer aggregation method which selects local explainers using non-convex optimization. In contrast to other heuristic methods, we use an integer optimization framework to combine local explainers into a near-global aggregate explainer. Our framework allows a decision maker to directly trade off coverage and fidelity of the resulting aggregation through the parameters of the optimization problem. We also propose a novel local explainer algorithm based on information filtering. We evaluate our algorithmic framework on two healthcare datasets, the Parkinson's Progression Marker Initiative (PPMI) dataset and a geriatric mobility dataset, a choice motivated by the anticipated need for explainable precision medicine. Our method outperforms existing local explainer aggregation methods in terms of both fidelity and coverage of classification, and improves on fidelity over existing global explainer methods, particularly in multi-class settings where state-of-the-art methods achieve 70% fidelity and ours achieves 90%.
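
The coverage/fidelity trade-off described in the abstract can be illustrated with a toy sketch. This is not the paper's formulation: the function name `aggregate_explainers`, the parameters `budget` and `min_fidelity`, and the example data are all hypothetical, and exhaustive search stands in for an integer programming solver. The sketch assumes each local explainer comes with the set of points it covers and a fidelity score.

```python
from itertools import combinations


def aggregate_explainers(covers, fidelity, budget, min_fidelity):
    """Toy selection problem (hypothetical, not the paper's model).

    covers[i]    -- set of data-point indices that local explainer i covers
    fidelity[i]  -- agreement rate of explainer i with the black-box model
    budget       -- maximum number of local explainers to combine
    min_fidelity -- explainers below this fidelity are excluded

    Returns the subset (tuple of indices) covering the most points,
    together with the set of covered points.
    """
    eligible = [i for i, f in enumerate(fidelity) if f >= min_fidelity]
    best_subset, best_covered = (), set()
    # Exhaustive search over small subsets; an integer program would
    # replace this loop at realistic problem sizes.
    for k in range(1, budget + 1):
        for subset in combinations(eligible, k):
            covered = set().union(*(covers[i] for i in subset))
            if len(covered) > len(best_covered):
                best_subset, best_covered = subset, covered
    return best_subset, best_covered


# Made-up example: four local explainers over six data points.
covers = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 5}]
fidelity = [0.95, 0.70, 0.90, 0.85]

print(aggregate_explainers(covers, fidelity, budget=2, min_fidelity=0.80))
print(aggregate_explainers(covers, fidelity, budget=2, min_fidelity=0.92))
```

In this made-up example, raising `min_fidelity` from 0.80 to 0.92 leaves only one eligible explainer, so coverage drops from all six points to three. That is the kind of trade-off a decision maker would tune through the optimization parameters.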
