Paper Title

Disentangling Model Multiplicity in Deep Learning

Paper Authors

Ari Heljakka, Martin Trapp, Juho Kannala, Arno Solin

Paper Abstract

Model multiplicity is a well-known but poorly understood phenomenon that undermines the generalisation guarantees of machine learning models. It appears when two models with similar training-time performance differ in their predictions and real-world performance characteristics. This observed 'predictive' multiplicity (PM) also implies elusive differences in the internals of the models, their 'representational' multiplicity (RM). We introduce a conceptual and experimental setup for analysing RM by measuring activation similarity via singular vector canonical correlation analysis (SVCCA). We show that certain differences in training methods systematically result in larger RM than others and evaluate RM and PM over a finite sample as predictors for generalizability. We further correlate RM with PM measured by the variance in i.i.d. and out-of-distribution test predictions in four standard image data sets. Finally, instead of attempting to eliminate RM, we call for its systematic measurement and maximal exposure.
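The abstract measures representational multiplicity (RM) by comparing activation matrices with SVCCA. A minimal NumPy sketch of the standard SVCCA procedure is shown below; the function name `svcca`, the 99% variance-retention threshold, and the whitening-based CCA step are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def svcca(X, Y, keep=0.99):
    """Sketch of SVCCA between two activation matrices.

    X, Y: arrays of shape (n_examples, n_neurons). The neuron
    counts may differ between the two networks.
    keep: fraction of variance retained in the SVD step
    (an illustrative choice; the paper may use another value).
    Returns the mean canonical correlation in [0, 1].
    """
    # Center activations over examples.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    def svd_reduce(A):
        # Keep the top singular directions explaining `keep`
        # of the variance, discarding low-variance noise.
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        cum = np.cumsum(s**2) / np.sum(s**2)
        k = int(np.searchsorted(cum, keep)) + 1
        return U[:, :k] * s[:k]

    Xr, Yr = svd_reduce(X), svd_reduce(Y)

    def orthobasis(A):
        # Orthonormal basis for the column space (whitening).
        U, _, _ = np.linalg.svd(A, full_matrices=False)
        return U

    Qx, Qy = orthobasis(Xr), orthobasis(Yr)
    # CCA: singular values of Qx^T Qy are the canonical correlations.
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.mean(np.clip(rho, 0.0, 1.0)))
```

A high mean correlation indicates similar learned representations; in the paper's framing, larger RM between two trained networks would correspond to a lower SVCCA score on shared test inputs.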
