Paper Title

On the Granularity of Explanations in Model Agnostic NLP Interpretability

Paper Authors

Yves Rychener, Xavier Renard, Djamé Seddah, Pascal Frossard, Marcin Detyniecki

Paper Abstract

Current methods for black-box NLP interpretability, such as LIME or SHAP, are based on altering the text to be interpreted by removing words and modeling the black-box response. In this paper, we outline the limitations of this approach when using complex BERT-based classifiers: word-based sampling produces texts that are out-of-distribution for the classifier, and it further gives rise to a high-dimensional search space that cannot be sufficiently explored when time or computational power is limited. Both of these challenges can be addressed by using segments as the elementary building blocks for NLP interpretability. As an illustration, we show that the simple choice of sentences as segments greatly improves on both challenges. As a consequence, the resulting explainer attains much better fidelity on a benchmark classification task.
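To make the granularity contrast concrete, below is a minimal LIME-style sketch (our own illustration, not the authors' implementation). It samples binary presence masks over either words or sentences, queries a black-box classifier on the perturbed texts, and fits a linear surrogate whose coefficients rank the units. The `black_box` callable and the toy keyword classifier are hypothetical stand-ins for a BERT model; with d units the mask space has 2^d elements, which is why sentence granularity (small d) is far easier to explore under a limited query budget.

```python
# Minimal sketch of perturbation-based explanation at two granularities.
# Assumptions: `black_box` is any callable text -> score; the toy
# classifier at the bottom is a hypothetical stand-in for a BERT model.
import random
import re

import numpy as np
from sklearn.linear_model import Ridge


def perturb(units, n_samples, rng):
    """Sample binary presence masks over `units` and build perturbed texts."""
    masks, texts = [], []
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in units]
        masks.append(mask)
        texts.append(" ".join(u for u, keep in zip(units, mask) if keep))
    return np.array(masks, dtype=float), texts


def explain(text, black_box, granularity="sentence", n_samples=200, seed=0):
    """Fit a linear surrogate to the black box; coefficients rank the units."""
    rng = random.Random(seed)
    if granularity == "word":
        units = text.split()  # d = number of words -> 2**d possible masks
    else:
        # Naive sentence splitter; a real pipeline would use a proper one.
        units = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    masks, texts = perturb(units, n_samples, rng)
    preds = np.array([black_box(t) for t in texts])
    surrogate = Ridge(alpha=1.0).fit(masks, preds)
    return sorted(zip(units, surrogate.coef_), key=lambda p: -abs(p[1]))


# Toy usage: a hypothetical keyword "black box" standing in for a classifier.
review = "The plot was dull. The acting, however, was superb!"
clf = lambda t: 1.0 if "superb" in t else 0.0
print(explain(review, clf, granularity="sentence"))
```

With sentence units, the example above has only 2^2 = 4 possible masks, so 200 samples cover the space exhaustively; at word granularity the same text already yields 2^10 masks, and realistic documents quickly exceed any practical query budget.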
