Paper Title
Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness
Paper Authors
Paper Abstract
Bayesian neural networks (BNNs) and deep ensembles are principled approaches to estimating the predictive uncertainty of a deep learning model. However, their practicality in real-time, industrial-scale applications is limited due to their heavy memory and inference costs. This motivates us to study principled approaches to high-quality uncertainty estimation that require only a single deep neural network (DNN). By formalizing uncertainty quantification as a minimax learning problem, we first identify input distance awareness, i.e., the model's ability to quantify the distance of a test example from the training data in the input space, as a necessary condition for a DNN to achieve high-quality (i.e., minimax-optimal) uncertainty estimation. We then propose the Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs by adding a weight normalization step during training and replacing the output layer with a Gaussian process. On a suite of vision and language understanding tasks and on modern architectures (Wide-ResNet and BERT), SNGP is competitive with deep ensembles in prediction, calibration, and out-of-domain detection, and outperforms other single-model approaches.
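The two ingredients named in the abstract can be sketched concretely. This is a minimal, hypothetical NumPy illustration (not the paper's implementation): spectral normalization rescales each hidden weight matrix so its largest singular value is bounded, which bounds the layer's Lipschitz constant and helps the hidden representation preserve input distances; the dense output layer is replaced by a random-Fourier-feature approximation of an RBF-kernel Gaussian process. All names, shapes, and constants below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_normalize(W, c=1.0):
    # Rescale W so its largest singular value is at most c. This bounds the
    # layer's Lipschitz constant, a proxy for the "weight normalization step"
    # the abstract describes (illustrative, not the paper's exact scheme).
    s_max = np.linalg.svd(W, compute_uv=False)[0]
    return W * min(1.0, c / s_max)

def rff_gp_logits(h, W_rff, b_rff, beta):
    # Random-Fourier-feature approximation of an RBF-kernel GP output layer:
    # phi(h) = sqrt(2/m) * cos(W_rff h + b_rff), logits = phi(h) beta^T.
    m = W_rff.shape[0]
    phi = np.sqrt(2.0 / m) * np.cos(h @ W_rff.T + b_rff)
    return phi @ beta.T

# Toy forward pass: two spectrally normalized hidden layers + GP head.
in_dim, hidden, m, classes = 16, 64, 128, 2
W1 = spectral_normalize(rng.standard_normal((hidden, in_dim)))
W2 = spectral_normalize(rng.standard_normal((hidden, hidden)))
W_rff = rng.standard_normal((m, hidden))      # fixed random projection
b_rff = 2 * np.pi * rng.random(m)             # fixed random phases
beta = rng.standard_normal((classes, m))      # trainable output weights

x = rng.standard_normal((8, in_dim))          # batch of 8 inputs
h = np.maximum(x @ W1.T, 0.0)                 # ReLU hidden layer 1
h = np.maximum(h @ W2.T, 0.0)                 # ReLU hidden layer 2
logits = rff_gp_logits(h, W_rff, b_rff, beta)
print(logits.shape)  # (8, 2)
```

In a trained SNGP, the GP head additionally yields a predictive variance (e.g., via a Laplace approximation over `beta`), so inputs far from the training data in the distance-preserving hidden space receive higher uncertainty; that step is omitted here for brevity.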