Paper Title

Interpretable feature subset selection: A Shapley value based approach

Authors

Tripathi, Sandhya, Hemachandra, N., Trivedi, Prashant

Abstract

For feature selection and related problems, we introduce the notion of a classification game, a cooperative game with features as players and a hinge loss based characteristic function, and relate a feature's contribution to Shapley value based error apportioning (SVEA) of the total training error. Our major contribution is ($\star$) to show that for any dataset the threshold 0 on SVEA values identifies the feature subset whose joint interactions for label prediction are significant, or those features that span a subspace where the data predominantly lies. In addition, our scheme ($\star$) identifies the features on which the Bayes classifier does not depend but on which any surrogate loss function based finite sample classifier does; such features contribute to the excess $0$-$1$ risk of that classifier; ($\star$) estimates the unknown true hinge risk of a feature; and ($\star$) relates the stability property of an allocation to negative valued SVEA by designing the analogue of the core of the classification game. Because the Shapley value is computationally expensive, we build on a known Monte Carlo based approximation algorithm that computes the characteristic function (a linear program) only when needed. We address the potential sample bias problem in feature selection by providing interval estimates for SVEA values obtained from multiple sub-samples. We illustrate all of the above aspects on various synthetic and real datasets and show that our scheme achieves better results than the existing recursive feature elimination technique and ReliefF in most cases. Our theoretically grounded classification game, in terms of a well defined characteristic function, offers interpretability (which we formalize in terms of the final task) and explainability of our framework, including identification of important features.
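To make the abstract's pipeline concrete, the sketch below illustrates the general idea of Monte Carlo permutation sampling for Shapley-value-based error apportioning. It is a minimal illustration, not the authors' algorithm: the paper's characteristic function is computed via linear programs, whereas here a least-squares linear fit is used as a hypothetical stand-in, and all function names (`hinge_error`, `monte_carlo_svea`) are our own. As in the paper's approach, the characteristic function is evaluated lazily, only for coalitions that actually appear in a sampled permutation.

```python
import numpy as np

def hinge_error(X, y, subset):
    """Characteristic function v(S): total training hinge loss of a simple
    linear classifier restricted to the features in `subset`.
    (Least-squares fit as a stand-in for the paper's LP-based classifier.)"""
    if not subset:
        return float(len(y))  # empty coalition predicts 0: hinge loss 1 per sample
    Xs = X[:, sorted(subset)]
    A = np.hstack([Xs, np.ones((len(y), 1))])  # append an intercept column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    margins = y * (A @ w)
    return float(np.maximum(0.0, 1.0 - margins).sum())

def monte_carlo_svea(X, y, n_perms=200, seed=0):
    """Approximate each feature's Shapley share of the total training error
    by averaging marginal contributions over random feature permutations."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    phi = np.zeros(d)
    cache = {}  # memoize v(S): evaluate the characteristic function only when needed

    def v(S):
        key = frozenset(S)
        if key not in cache:
            cache[key] = hinge_error(X, y, key)
        return cache[key]

    for _ in range(n_perms):
        S = set()
        prev = v(S)
        for i in rng.permutation(d):
            S.add(i)
            cur = v(S)
            phi[i] += cur - prev  # marginal contribution of feature i
            prev = cur
    return phi / n_perms
```

By efficiency of the Shapley value, the returned values sum to $v(N) - v(\emptyset)$, so they apportion the change in total training error across features; the zero threshold discussed in the abstract then separates the error-reducing (important) features from the rest. Interval estimates would be obtained by repeating this over multiple sub-samples of the data.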
