Paper Title

Training independent subnetworks for robust prediction

Paper Authors

Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

Paper Abstract

Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple predictions can be achieved "for free" under a single model's forward pass. In particular, we show that, using a multi-input multi-output (MIMO) configuration, one can utilize a single model's capacity to train multiple subnetworks that independently learn the task at hand. By ensembling the predictions made by the subnetworks, we improve model robustness without increasing compute. We observe a significant improvement in negative log-likelihood, accuracy, and calibration error on CIFAR10, CIFAR100, ImageNet, and their out-of-distribution variants compared to previous methods.
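A minimal NumPy sketch of the MIMO wiring the abstract describes: M inputs are concatenated and pushed through one shared network that emits M predictions; at training time each slot sees an independent example, and at test time the same input is repeated M times and the M predictions are averaged. The toy two-layer backbone and all sizes (M, D, H, C) are illustrative assumptions, not the paper's architecture; training code is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 3   # number of subnetworks (ensemble members) -- illustrative choice
D = 8   # input feature dimension
H = 32  # hidden width of the shared backbone
C = 10  # number of classes

# One shared model: the input layer consumes the M concatenated inputs
# (M*D features) and the output layer produces M sets of class logits (M*C).
W1 = rng.normal(0.0, 0.1, (M * D, H))
W2 = rng.normal(0.0, 0.1, (H, M * C))

def mimo_forward(xs):
    """xs: (batch, M, D) -- M inputs handled in a single forward pass."""
    h = np.maximum(xs.reshape(len(xs), M * D) @ W1, 0.0)  # ReLU backbone
    return (h @ W2).reshape(len(xs), M, C)                # one head per slot

# Training view: each slot receives an independent example, so each
# subnetwork learns the task on its own (loss/optimizer omitted).
train_batch = rng.normal(size=(16, M, D))
train_logits = mimo_forward(train_batch)                  # (16, M, C)

# Test view: the same input is repeated M times; averaging the M softmax
# outputs gives an ensemble at the cost of one forward pass.
x_test = rng.normal(size=(4, D))
test_logits = mimo_forward(np.repeat(x_test[:, None, :], M, axis=1))
probs = np.exp(test_logits) / np.exp(test_logits).sum(-1, keepdims=True)
ensemble_probs = probs.mean(axis=1)                       # (4, C)
```

The only structural change relative to a plain classifier is at the two ends of the network, which is why the parameter and compute overhead is negligible.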
