Paper Title
Combating the Instability of Mutual Information-based Losses via Regularization
Paper Authors
Paper Abstract
Notable progress has been made in numerous fields of machine learning based on neural network-driven mutual information (MI) bounds. However, utilizing conventional MI-based losses is often challenging due to their practical and mathematical limitations. In this work, we first identify the symptoms behind their instability: (1) the neural network fails to converge even after the loss appears to have converged, and (2) saturating neural network outputs cause the loss to diverge. We mitigate both issues by adding a novel regularization term to the existing losses. We theoretically and experimentally demonstrate that the added regularization stabilizes training. Finally, we present a novel benchmark that evaluates MI-based losses on both their MI estimation power and their capability on downstream tasks, closely following pre-existing supervised and contrastive learning settings. We evaluate six different MI-based losses and their regularized counterparts on multiple benchmarks to show that our approach is simple yet effective.
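For illustration, the sketch below shows one way a regularization term could be attached to a Donsker-Varadhan (MINE-style) MI lower bound in PyTorch. This is a minimal sketch under assumptions: the critic architecture, the penalty on the squared log-partition estimate, and the weight `reg_lambda` are illustrative choices, not necessarily the exact regularizer proposed in the paper.

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Small MLP critic T(x, y) used inside the MI lower bound (assumed architecture)."""
    def __init__(self, x_dim, y_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)

def regularized_dv_loss(critic, x, y, reg_lambda=0.1):
    """Negative Donsker-Varadhan MI bound plus an illustrative regularizer."""
    # Critic scores on joint samples and on shuffled (approximate product-of-marginals) samples.
    t_joint = critic(x, y)
    t_marginal = critic(x, y[torch.randperm(y.size(0))])
    # log E_{p(x)p(y)}[exp(T)], estimated over the batch.
    log_partition = torch.logsumexp(t_marginal, dim=0) - math.log(t_marginal.size(0))
    dv_bound = t_joint.mean() - log_partition
    # Regularization (assumed form): keep the log-partition estimate from drifting or saturating.
    reg = reg_lambda * log_partition.pow(2)
    return -dv_bound + reg
```

Minimizing the returned value maximizes the DV bound while anchoring the critic's log-partition estimate, which targets the saturation and non-convergence symptoms described in the abstract.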