Title
Investigating maximum likelihood based training of infinite mixtures for uncertainty quantification
Authors
Abstract
Uncertainty quantification in neural networks has gained a lot of attention in recent years. The most popular approaches, Bayesian neural networks (BNNs), Monte Carlo dropout, and deep ensembles, have one thing in common: they are all based on some kind of mixture model. While BNNs build infinite mixture models and are derived via variational inference, the latter two build finite mixtures trained with the maximum likelihood method. In this work we investigate the effect of training an infinite mixture distribution with the maximum likelihood method instead of variational inference. We find that the proposed objective leads to stochastic networks with increased predictive variance, which improves uncertainty-based identification of misclassifications and robustness against adversarial attacks compared to a standard BNN with an equivalent network structure. The new model also displays higher entropy on out-of-distribution data.
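The core distinction the abstract draws, maximizing the likelihood of the mixture itself rather than the variational expected log-likelihood, can be illustrated with a small NumPy sketch. The per-sample likelihood values below are synthetic placeholders, not from the paper; the point is only the Jensen's-inequality gap between the two objectives:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical likelihoods p(y | x, w_s) for S weight samples w_s drawn
# from the weight distribution q(w); values are illustrative only.
likelihoods = rng.uniform(0.05, 0.95, size=1000)

# Maximum-likelihood objective of the (Monte Carlo) infinite mixture:
# log of the averaged likelihood, i.e. log E_q[p(y | x, w)].
ml_objective = np.log(likelihoods.mean())

# Expected log-likelihood term appearing in the variational ELBO:
# E_q[log p(y | x, w)].
elbo_term = np.log(likelihoods).mean()

# By Jensen's inequality, log E[.] >= E[log .], so the mixture
# log-likelihood upper-bounds the ELBO's data term.
print(ml_objective >= elbo_term)  # True
```

Because the mixture log-likelihood is maximized directly, individual weight samples are not each forced to explain the data well, which is one intuition for the increased predictive variance the abstract reports.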