Paper Title

ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model

Paper Authors

Srishti Gautam, Ahcene Boubekki, Stine Hansen, Suaiba Amina Salahuddin, Robert Jenssen, Marina MC Höhne, Michael Kampffmeyer

Paper Abstract

The need for interpretable models has fostered the development of self-explainable classifiers. Prior approaches are either based on multi-stage optimization schemes, impacting the predictive performance of the model, or produce explanations that are not transparent, trustworthy or do not capture the diversity of the data. To address these shortcomings, we propose ProtoVAE, a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner and enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint. Finally, the model is designed to be transparent by directly incorporating the prototypes into the decision process. Extensive comparisons with previous self-explainable approaches demonstrate the superiority of ProtoVAE, highlighting its ability to generate trustworthy and diverse explanations, while not degrading predictive performance.
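
The abstract describes three core mechanisms: class-specific prototypes learned in a VAE latent space, a decision layer that scores classes by similarity to those prototypes, and an orthonormality constraint that keeps each class's prototypes diverse. The following PyTorch snippet is a minimal, hypothetical sketch of these ideas only; the class and parameter names (`PrototypeClassifier`, `protos_per_class`, `orthonormality_loss`) are illustrative assumptions and do not reflect the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): class-specific prototypes in a
# latent space, a similarity-based decision layer, and an orthonormality penalty.
import torch
import torch.nn as nn


class PrototypeClassifier(nn.Module):
    def __init__(self, latent_dim: int, num_classes: int, protos_per_class: int):
        super().__init__()
        # Learnable class-specific prototypes living in the VAE latent space.
        self.prototypes = nn.Parameter(
            torch.randn(num_classes, protos_per_class, latent_dim)
        )
        self.num_classes = num_classes
        self.protos_per_class = protos_per_class

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Similarity of each latent code to every prototype
        # (negative squared Euclidean distance).
        protos = self.prototypes.view(-1, self.prototypes.size(-1))   # (C*M, D)
        dists = torch.cdist(z, protos) ** 2                           # (B, C*M)
        sims = -dists.view(z.size(0), self.num_classes, self.protos_per_class)
        # Transparent decision: class logits are aggregated prototype similarities.
        return sims.mean(dim=-1)                                      # (B, C)

    def orthonormality_loss(self) -> torch.Tensor:
        # Push each class's (mean-centered) prototype Gram matrix toward the
        # identity, encouraging diverse, non-redundant prototypes per class.
        loss = 0.0
        for c in range(self.num_classes):
            p = self.prototypes[c]                                    # (M, D)
            p = p - p.mean(dim=0, keepdim=True)
            gram = p @ p.t()                                          # (M, M)
            eye = torch.eye(p.size(0), device=p.device)
            loss = loss + ((gram - eye) ** 2).sum()
        return loss / self.num_classes
```

In the full framework suggested by the abstract, such a head would sit on top of a VAE encoder and be trained end-to-end together with the reconstruction and regularization terms; decoding the prototypes through the VAE decoder is one natural way to render them as visual explanations, though the exact formulation here is an assumption.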
