Monitoring Shortcut Learning using Mutual Information

Paper title

Monitoring Shortcut Learning using Mutual Information

Authors

Mohammed Adnan, Yani Ioannou, Chuan-Yung Tsai, Angus Galloway, H. R. Tizhoosh, Graham W. Taylor

Abstract

The failure of deep neural networks to generalize to out-of-distribution data is a well-known problem and raises concerns about deploying trained networks in safety-critical domains such as healthcare, finance, and autonomous vehicles. We study a particular kind of distribution shift: shortcuts, or spurious correlations, in the training data. Shortcut learning is often exposed only when models are evaluated on real-world data that does not contain the same spurious correlations, which makes it difficult for AI practitioners to properly assess how effective a trained model will be in real-world applications. In this work, we propose using the mutual information (MI) between the learned representation and the input as a metric for finding the point in training at which the network latches onto shortcuts. Experiments demonstrate that MI can be used as a domain-agnostic metric for monitoring shortcut learning.
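
To make the proposed metric concrete, the following is a minimal sketch of estimating the MI between an input and a learned representation from paired samples. The abstract does not say which MI estimator the authors use, so this rests on assumptions: both variables are already discretized (for example, binned activations), the estimator is the simple plug-in (count-based) one, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ts):
    """Plug-in estimate of I(X; T) in bits from paired discrete samples.

    Assumption: xs (inputs) and ts (representations) are sequences of
    hashable, already-discretized values. This is an illustrative
    estimator, not necessarily the one used in the paper.
    """
    if len(xs) != len(ts):
        raise ValueError("xs and ts must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ts))  # counts of each (x, t) pair
    count_x = Counter(xs)         # marginal counts of x
    count_t = Counter(ts)         # marginal counts of t
    mi = 0.0
    for (x, t), c in joint.items():
        # p(x, t) * log2( p(x, t) / (p(x) * p(t)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_t[t]))
    return mi


# Toy usage: a representation that copies a binary input carries 1 bit,
# while one that is independent of the input carries 0 bits.
inputs = [0, 0, 1, 1]
print(mutual_information(inputs, [0, 0, 1, 1]))
print(mutual_information(inputs, [0, 1, 0, 1]))
```

Under these assumptions, monitoring would amount to evaluating such an estimate at successive training checkpoints and looking for the point at which its trend changes. For continuous, high-dimensional representations a plug-in estimate is unreliable, and a different estimator would be needed.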
