Paper Title

Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four

Authors

Stephan Wäldchen, Felix Huber, Sebastian Pokutta

Abstract

One of the goals of Explainable AI (XAI) is to determine which input components were relevant for a classifier decision. This is commonly known as saliency attribution. Characteristic functions (from cooperative game theory) are able to evaluate partial inputs and form the basis for theoretically "fair" attribution methods like Shapley values. Given only a standard classifier function, it is unclear how partial input should be realised. Instead, most XAI-methods for black-box classifiers like neural networks consider counterfactual inputs that generally lie off-manifold. This makes them hard to evaluate and easy to manipulate. We propose a setup to directly train characteristic functions in the form of neural networks to play simple two-player games. We apply this to the game of Connect Four by randomly hiding colour information from our agents during training. This has three advantages for comparing XAI-methods: it alleviates the ambiguity about how to realise partial input, makes off-manifold evaluation unnecessary, and allows us to compare the methods by letting them play against each other.
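To illustrate how a characteristic function underlies Shapley-value attribution, here is a minimal sketch that computes exact Shapley values by enumerating all subsets. It is not the authors' implementation: in the paper the characteristic function is a trained neural network, whereas `toy_v` below is a hypothetical hand-written stand-in, and the names `shapley_values` and `toy_v` are assumptions made for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Exact Shapley values for features 0..n-1.

    v is a characteristic function: it maps a frozenset of visible
    features (a partial input) to a real-valued score.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset that excludes i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of revealing feature i on top of s
                phi[i] += weight * (v(s | {i}) - v(s))
    return phi


def toy_v(s):
    # Hypothetical characteristic function: the score is 1 only when
    # features 0 and 1 are both visible; feature 2 is irrelevant.
    return 1.0 if {0, 1} <= s else 0.0


phi = shapley_values(3, toy_v)
print(phi)
```

In this toy case, features 0 and 1 share the credit equally and feature 2 receives none. Exact enumeration scales exponentially in the number of features, so for an input the size of a Connect Four board it would have to be replaced by a sampling approximation.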
