Paper Title
Calibrating Deep Neural Networks using Explicit Regularisation and Dynamic Data Pruning
Paper Authors
Paper Abstract
Deep neural networks (DNNs) are prone to miscalibrated predictions, often exhibiting a mismatch between the predicted output and the associated confidence score. Contemporary model calibration techniques mitigate the problem of overconfident predictions by pushing down the confidence of the winning class while increasing the confidence of the remaining classes across all test samples. However, from a deployment perspective, an ideal model should (i) generate well-calibrated predictions for high-confidence samples (e.g., with predicted probability >0.95), and (ii) generate a higher proportion of legitimate high-confidence samples. To this end, we propose a novel regularization technique that can be used alongside classification losses, leading to state-of-the-art calibrated predictions at test time. From a deployment standpoint in safety-critical applications, only high-confidence samples from a well-calibrated model are of interest, as the remaining samples must undergo manual inspection. Reducing the predictive confidence of these potentially "high-confidence samples" is a downside of existing calibration approaches. We mitigate this by proposing a dynamic train-time data pruning strategy that prunes low-confidence samples every few epochs, yielding an increase in "confident yet calibrated" samples. We demonstrate state-of-the-art calibration performance across image classification benchmarks while reducing training time, without much compromise in accuracy. We also provide insights into why our dynamic pruning strategy, which removes low-confidence training samples, leads to an increase in high-confidence samples at test time.
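The abstract describes the training procedure only at a high level; the paper's actual regularizer, confidence threshold, and pruning schedule are not given here. The sketch below is a minimal PyTorch illustration of the dynamic train-time pruning idea under stated assumptions: `PRUNE_EVERY`, `CONF_THRESHOLD`, `REG_WEIGHT`, and `calibration_regularizer` are all hypothetical placeholders introduced for illustration, not the authors' method.

```python
# Illustrative sketch of dynamic train-time data pruning: every few epochs,
# samples on which the model is least confident are dropped from the training
# set. All constants and the regularizer below are assumptions, not the paper's.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

PRUNE_EVERY = 5        # hypothetical: re-prune the training set every 5 epochs
CONF_THRESHOLD = 0.5   # hypothetical: drop samples below this predicted confidence
REG_WEIGHT = 0.1       # hypothetical weight for the explicit regularizer

def calibration_regularizer(logits):
    # Placeholder for the paper's (unspecified) explicit regularizer; a generic
    # entropy-based term is used here purely to make the sketch runnable.
    probs = F.softmax(logits, dim=1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()

def prune_low_confidence(model, dataset, device):
    """Keep only samples whose current predicted confidence meets the threshold."""
    model.eval()
    keep = []
    loader = DataLoader(dataset, batch_size=256)  # no shuffle: order gives indices
    offset = 0
    with torch.no_grad():
        for x, _ in loader:
            conf = F.softmax(model(x.to(device)), dim=1).max(dim=1).values
            keep.extend(offset + i for i, c in enumerate(conf.tolist())
                        if c >= CONF_THRESHOLD)
            offset += x.size(0)
    return Subset(dataset, keep)

def train(model, dataset, optimizer, epochs, device):
    active = dataset
    for epoch in range(epochs):
        if epoch > 0 and epoch % PRUNE_EVERY == 0:
            # Shrinking the train set also shortens subsequent epochs, matching
            # the reported reduction in training time.
            active = prune_low_confidence(model, active, device)
        model.train()
        for x, y in DataLoader(active, batch_size=128, shuffle=True):
            logits = model(x.to(device))
            loss = (F.cross_entropy(logits, y.to(device))
                    + REG_WEIGHT * calibration_regularizer(logits))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

Note the design choice implied by the abstract: pruning operates on the *training* set during training (unlike post-hoc calibration, which rescales test-time confidences), so the model spends later epochs on samples it already classifies confidently.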