Paper Title
Measuring Forgetting of Memorized Training Examples
Paper Authors
Paper Abstract
Machine learning models exhibit two seemingly contradictory phenomena: training data memorization, and various forms of forgetting. In memorization, models overfit specific training examples and become susceptible to privacy attacks. In forgetting, examples which appeared early in training are forgotten by the end. In this work, we connect these phenomena. We propose a technique to measure to what extent models "forget" the specifics of training examples, becoming less susceptible to privacy attacks on examples they have not seen recently. We show that, while non-convex models can memorize data forever in the worst-case, standard image, speech, and language models empirically do forget examples over time. We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget. Our results suggest that examples seen early when training with extremely large datasets - for instance those examples used to pre-train a model - may observe privacy benefits at the expense of examples seen later.
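To make the measurement idea concrete, below is a minimal, illustrative sketch (not the authors' exact protocol) of how one might quantify forgetting: apply a simple loss-threshold membership inference attack to training examples grouped by how recently the model saw them, and track how the attack's advantage decays for earlier examples. All function names and the synthetic loss values are assumptions for illustration only.

```python
# Illustrative sketch: loss-threshold membership inference, bucketed by when
# each member example was last seen during training. Synthetic data only.
import numpy as np

def loss_threshold_attack_advantage(member_losses, nonmember_losses):
    """Sweep a loss threshold and return the best membership-inference
    advantage (true positive rate minus false positive rate)."""
    thresholds = np.concatenate([member_losses, nonmember_losses])
    best = 0.0
    for t in thresholds:
        tpr = np.mean(member_losses <= t)   # members tend to have lower loss
        fpr = np.mean(nonmember_losses <= t)
        best = max(best, tpr - fpr)
    return best

def forgetting_curve(losses_by_stage, nonmember_losses):
    """For each training stage (earliest first), report the attack advantage
    on examples last seen at that stage. A decaying curve suggests earlier
    examples are 'forgotten' and regain some privacy."""
    return {stage: loss_threshold_attack_advantage(losses, nonmember_losses)
            for stage, losses in losses_by_stage.items()}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic final-model losses: recently seen members have the lowest loss,
    # early members drift back toward the non-member distribution.
    member_losses_by_stage = {
        "seen_early":  rng.normal(1.6, 0.4, 1000),
        "seen_middle": rng.normal(1.2, 0.4, 1000),
        "seen_late":   rng.normal(0.6, 0.4, 1000),
    }
    nonmember_losses = rng.normal(2.0, 0.4, 1000)
    print(forgetting_curve(member_losses_by_stage, nonmember_losses))
```

In a real experiment, the synthetic losses would be replaced by the final model's per-example losses on held-out non-members and on members bucketed by the step at which they were last trained on.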