On the Robustness of Sparse Counterfactual Explanations to Adverse Perturbations

Paper title

On the Robustness of Sparse Counterfactual Explanations to Adverse Perturbations

Authors

Marco Virgolin, Saverio Fracaros

Abstract

Counterfactual explanations (CEs) are a powerful means for understanding how decisions made by algorithms can be changed. Researchers have proposed a number of desiderata that CEs should meet to be practically useful, such as requiring minimal effort to enact, or complying with causal models. We consider a further aspect to improve the usability of CEs: robustness to adverse perturbations, which may naturally happen due to unfortunate circumstances. Since CEs typically prescribe a sparse form of intervention (i.e., only a subset of the features should be changed), we study the effect of addressing robustness separately for the features that are recommended to be changed and those that are not. Our definitions are workable in that they can be incorporated as penalty terms in the loss functions that are used for discovering CEs. To experiment with robustness, we create and release code where five data sets (commonly used in the field of fair and explainable machine learning) have been enriched with feature-specific annotations that can be used to sample meaningful perturbations. Our experiments show that CEs are often not robust and, if adverse perturbations take place (even if not worst-case), the intervention they prescribe may require a much larger cost than anticipated, or even become impossible. However, accounting for robustness in the search process, which can be done rather easily, allows discovering robust CEs systematically. Robust CEs make additional intervention to contrast perturbations much less costly than non-robust CEs. We also find that robustness is easier to achieve for the features to change, posing an important point of consideration for the choice of what counterfactual explanation is best for the user. Our code is available at: https://github.com/marcovirgolin/robust-counterfactuals.
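The abstract notes that robustness can be incorporated as penalty terms in the loss used to search for CEs, with separate treatment of the features that change and those that do not. The following is a minimal sketch of that idea under stated assumptions, not the authors' definitions or code: the toy linear classifier, the uniform perturbation model, the sampled failure-rate penalty, and all names (`predict`, `robustness_penalty`, `ce_loss`, `max_shift`, `lam_changed`, `lam_kept`) are hypothetical and chosen only for illustration.

```python
import random


def predict(x, weights, bias):
    """Toy linear classifier (assumption): 1 if the score is positive, else 0."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def l1_distance(x, z):
    """Cost of the intervention, measured here as L1 distance (assumption)."""
    return sum(abs(a - b) for a, b in zip(x, z))


def robustness_penalty(z, indices, max_shift, target, weights, bias, rng,
                       n_samples=200):
    """Fraction of sampled perturbations of `indices` that undo the target class.

    Perturbations are drawn uniformly in [-max_shift[i], +max_shift[i]]
    per feature; this perturbation model is an illustrative assumption.
    """
    if not indices:
        return 0.0
    failures = 0
    for _ in range(n_samples):
        perturbed = list(z)
        for i in indices:
            perturbed[i] += rng.uniform(-max_shift[i], max_shift[i])
        if predict(perturbed, weights, bias) != target:
            failures += 1
    return failures / n_samples


def ce_loss(x, z, target, weights, bias, max_shift,
            lam_changed=1.0, lam_kept=1.0, seed=0):
    """Distance + validity term + two separate robustness penalties."""
    rng = random.Random(seed)
    # Sparse intervention: split features into those the CE changes and keeps.
    changed = [i for i, (a, b) in enumerate(zip(x, z)) if a != b]
    kept = [i for i in range(len(x)) if i not in changed]
    validity = 0.0 if predict(z, weights, bias) == target else 1.0
    pen_changed = robustness_penalty(z, changed, max_shift, target,
                                     weights, bias, rng)
    pen_kept = robustness_penalty(z, kept, max_shift, target,
                                  weights, bias, rng)
    return (l1_distance(x, z) + 10.0 * validity
            + lam_changed * pen_changed + lam_kept * pen_kept)


weights, bias = [1.0, 1.0], -2.0
x = [0.0, 0.0]              # original point, classified as 0
z_fragile = [2.1, 0.0]      # barely crosses the decision boundary
z_robust = [3.0, 0.0]       # crosses it with a margin
max_shift = [0.5, 0.5]

loss_fragile = ce_loss(x, z_fragile, 1, weights, bias, max_shift,
                       lam_changed=5.0, lam_kept=5.0)
loss_robust = ce_loss(x, z_robust, 1, weights, bias, max_shift,
                      lam_changed=5.0, lam_kept=5.0)
print(loss_fragile, loss_robust)
```

In this toy setup the cheaper candidate sits close to the decision boundary, so once the penalty weights are large enough the search would prefer the candidate with a margin, despite its larger distance from the original point.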
