Paper Title
Interpretable Model Summaries Using the Wasserstein Distance
Paper Authors
Abstract
Statistical models often include thousands of parameters. However, large models decrease the investigator's ability to interpret and communicate the estimated parameters. Reducing the dimensionality of the parameter space in the estimation phase is a commonly used approach, but less work has focused on selecting subsets of the parameters for interpreting the estimated model, especially in settings such as Bayesian inference and model averaging. Importantly, many models do not have straightforward interpretations, which creates another layer of obfuscation. To address this gap, we introduce a new method that uses the Wasserstein distance to identify a low-dimensional, interpretable projection of the model. After estimating a complex model, users can budget how many parameters they wish to interpret, and the proposed method generates a simplified model of the desired dimension that minimizes the distance to the full model. We provide simulation results to illustrate the method and apply it to cancer datasets.
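To make the idea of a parameter budget concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm. It assumes a linear model with posterior draws of the coefficients, uses greedy forward selection, and pairs posterior draws by index. That pairing is only one possible coupling, so the reported value is an upper bound on the empirical 2-Wasserstein distance rather than the exact optimum. The function names (`paired_w2`, `project`, `greedy_summary`) and the simulated data are hypothetical.

```python
import numpy as np


def paired_w2(full_pred, reduced_pred):
    """Root mean squared gap between predictions of draws paired by index.

    Pairing draw s with draw s is one valid coupling, so this is an upper
    bound on the empirical 2-Wasserstein distance (assumption of this sketch).
    """
    sq_gap = np.sum((full_pred - reduced_pred) ** 2, axis=1)
    return float(np.sqrt(np.mean(sq_gap)))


def project(X, full_pred, active):
    """Least-squares projection of each draw's predictions onto X[:, active]."""
    Xa = X[:, active]
    # Solve one least-squares problem per posterior draw (columns of full_pred.T).
    coef, *_ = np.linalg.lstsq(Xa, full_pred.T, rcond=None)
    return (Xa @ coef).T


def greedy_summary(X, beta_draws, budget):
    """Greedily pick `budget` covariates whose projection stays closest to the full model."""
    full_pred = beta_draws @ X.T  # shape: (number of draws, number of observations)
    active, path = [], []
    for _ in range(budget):
        best_j, best_d = None, np.inf
        for j in range(X.shape[1]):
            if j in active:
                continue
            d = paired_w2(full_pred, project(X, full_pred, active + [j]))
            if d < best_d:
                best_j, best_d = j, d
        active.append(best_j)
        path.append(best_d)
    return active, path


# Simulated example: 10 covariates, only the first three matter.
rng = np.random.default_rng(0)
n_obs, n_cov, n_draws = 200, 10, 500
X = rng.normal(size=(n_obs, n_cov))
beta_mean = np.array([3.0, -2.0, 1.5] + [0.0] * (n_cov - 3))
beta_draws = beta_mean + 0.05 * rng.normal(size=(n_draws, n_cov))

active, path = greedy_summary(X, beta_draws, budget=3)
print("selected covariates:", active)
print("distance after each step:", path)
```

In this sketch the budget is the number of covariates the user is willing to interpret, and `path` records how far the reduced model's predictions remain from the full model's as each covariate is added. An exact Wasserstein computation would optimize over couplings of the draws instead of fixing the pairing.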