Paper Title
Feasible and Desirable Counterfactual Generation by Preserving Human Defined Constraints
Paper Authors
Abstract
We present a human-in-the-loop approach to generating counterfactual (CF) explanations that preserve global and local feasibility constraints. Global feasibility constraints are the causal constraints necessary for generating actionable CF explanations. Assuming a domain expert with knowledge of unary and binary causal constraints, our approach efficiently employs this knowledge to generate CF explanations by rejecting gradient steps that violate these constraints. Local feasibility constraints encode the end-user's constraints for generating desirable CF explanations. We extract these constraints from the end-user of the model and exploit them during CF generation via a user-defined distance metric. Through user studies, we demonstrate that incorporating causal constraints during CF generation results in significantly better explanations in terms of feasibility and desirability for participants. Adopting local and global feasibility constraints simultaneously, although it improves user satisfaction, does not significantly improve desirability for participants compared to incorporating only global constraints.
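
The two mechanisms described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the logistic-regression classifier, its weights, the feature order, the specific constraints, and the names `is_feasible`, `generate_cf`, and `cost` are all assumptions made for illustration. Global constraints are enforced by rejecting any coordinate update that would violate a unary or binary rule, and local constraints enter through a per-feature cost vector that weights the distance term.

```python
import numpy as np

# Hypothetical feature order: [age, education_level, hours_per_week]
W = np.array([0.03, 0.8, 0.05])  # toy classifier weights (assumed)
B = -5.0                         # toy classifier bias (assumed)


def predict_proba(x):
    """Probability of the favourable class under the toy logistic model."""
    return 1.0 / (1.0 + np.exp(-(x @ W + B)))


def is_feasible(x0, x):
    """Global (causal) constraints, as a domain expert might state them."""
    age_ok = x[0] >= x0[0]                          # unary: age cannot decrease
    edu_ok = x[1] >= x0[1]                          # unary: education cannot decrease
    edu_needs_age = x[1] <= x0[1] or x[0] > x0[0]   # binary: more education implies older
    return bool(age_ok and edu_ok and edu_needs_age)


def generate_cf(x0, cost, lr=0.05, lam=0.01, steps=2000, target=0.5):
    """Gradient search for a counterfactual of x0.

    cost: per-feature weights of the user-defined distance (local constraints).
    Each coordinate update is tried separately and rejected if the resulting
    point violates a global constraint.
    """
    x0 = np.asarray(x0, dtype=float)
    cost = np.asarray(cost, dtype=float)
    x = x0.copy()
    for _ in range(steps):
        p = predict_proba(x)
        if p > target:
            break
        # Gradient of: -log(p) + lam * sum(cost * (x - x0)**2)
        grad = -(1.0 - p) * W + 2.0 * lam * cost * (x - x0)
        for j in range(len(x)):
            cand = x.copy()
            cand[j] -= lr * grad[j]
            if is_feasible(x0, cand):   # reject steps that break a constraint
                x = cand
    return x


x0 = np.array([30.0, 2.0, 40.0])
cf = generate_cf(x0, cost=[1.0, 1.0, 1.0])
print("original:", x0, "p =", round(float(predict_proba(x0)), 3))
print("counterfactual:", cf, "p =", round(float(predict_proba(cf)), 3))
```

Raising an entry of `cost` makes changes to that feature more expensive, so the search shifts toward other features. With a large enough cost the search may stop short of the target probability, which is a limitation of this simplified sketch rather than a claim about the paper's method.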