Paper Title
Avoiding Spurious Correlations via Logit Correction
Paper Authors
Paper Abstract
Empirical studies suggest that machine learning models trained with empirical risk minimization (ERM) often rely on attributes that may be spuriously correlated with the class labels. Such models typically lead to poor performance during inference for data lacking such correlations. In this work, we explicitly consider a situation where potential spurious correlations are present in the majority of training data. In contrast with existing approaches, which use the ERM model outputs to detect the samples without spurious correlations and either heuristically upweight or upsample those samples, we propose the logit correction (LC) loss, a simple yet effective improvement on the softmax cross-entropy loss, to correct the sample logit. We demonstrate that minimizing the LC loss is equivalent to maximizing the group-balanced accuracy, so the proposed LC could mitigate the negative impacts of spurious correlations. Our extensive experimental results further reveal that the proposed LC loss outperforms state-of-the-art solutions on multiple popular benchmarks by a large margin, an average 5.5% absolute improvement, without access to spurious attribute labels. LC is also competitive with oracle methods that make use of the attribute labels. Code is available at https://github.com/shengliu66/LC.
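To make the core idea concrete, the sketch below shows a logit-corrected cross-entropy in PyTorch. This is a minimal illustration of the general logit-correction/adjustment principle the abstract describes, not the paper's full LC method: the function name and the `group_priors` argument (an estimated per-class prior, which the paper infers without spurious-attribute labels) are hypothetical placeholders for this example.

```python
import torch
import torch.nn.functional as F

def logit_corrected_loss(logits, targets, group_priors, tau=1.0):
    """Cross-entropy with an additive per-class logit correction.

    `group_priors` is a hypothetical estimate of how often each class
    co-occurs with the dominant (spurious) attribute. Adding tau * log(prior)
    to the logits before the softmax demands a larger margin from
    over-represented groups, so minimizing the corrected loss pushes the
    model toward group-balanced behavior rather than exploiting the
    spurious correlation.
    """
    corrected = logits + tau * torch.log(group_priors + 1e-12)
    return F.cross_entropy(corrected, targets)
```

Note that with a uniform prior the correction adds the same constant to every logit, so the loss reduces to the ordinary softmax cross-entropy; the correction only changes the training signal when some classes (or groups) are over-represented.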