Machine Unlearning Method Based On Projection Residual

Paper title

Machine Unlearning Method Based On Projection Residual

Authors

Cao, Zihao; Wang, Jianzong; Si, Shijing; Huang, Zhangcheng; Xiao, Jing

Abstract

Machine learning models, mainly neural networks, are increasingly used in real-world applications. Users feed their data to a model for training, but this process is usually one-way: once trained, the model remembers the data, and even after the data is removed from the dataset, its influence persists in the model. As more laws and regulations around the world protect data privacy, it becomes increasingly important to make models forget such data completely through machine unlearning. This paper adopts a projection residual method based on Newton's iteration method. The main goal is to carry out machine unlearning tasks for linear regression models and neural network models. The method mainly uses iterative weighting to completely forget the data and its corresponding influence; its computational cost is linear in the feature dimension of the data and independent of the size of the training set. The method can improve on current machine unlearning methods. Results are evaluated with the feature injection test (FIT). Experiments show that the method deletes data more thoroughly, with results close to those of retraining the model.
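
To make the idea of Newton-based unlearning in linear regression concrete, here is a minimal sketch. It is not the paper's projection residual method. It assumes a ridge-regularized least-squares model and shows only the general fact that, for a quadratic loss, one Newton step on the loss over the retained data reproduces the retrained weights. The function names (`fit_least_squares`, `newton_unlearn`), the regularization strength `reg`, and the synthetic data are all illustrative assumptions. Unlike the method described in the abstract, this sketch rebuilds the Hessian from the retained rows, so its cost does depend on the training set size.

```python
import numpy as np


def fit_least_squares(X, y, reg=1e-3):
    """Minimize 0.5 * ||Xw - y||^2 + 0.5 * reg * ||w||^2 in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)


def newton_unlearn(w, X, y, remove_idx, reg=1e-3):
    """Take one Newton step from w on the loss over the retained rows.

    Illustrative baseline only (an assumption, not the paper's update):
    the loss is quadratic, so a single step lands on its exact minimizer.
    """
    keep = np.ones(X.shape[0], dtype=bool)
    keep[remove_idx] = False
    X_keep, y_keep = X[keep], y[keep]
    d = X.shape[1]
    hessian = X_keep.T @ X_keep + reg * np.eye(d)
    grad = X_keep.T @ (X_keep @ w - y_keep) + reg * w
    return w - np.linalg.solve(hessian, grad)


# Synthetic data, made up purely for the demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=200)

remove_idx = [0, 1, 2]                     # rows to forget
w_full = fit_least_squares(X, y)           # model trained on all rows
w_unlearned = newton_unlearn(w_full, X, y, remove_idx)
w_retrained = fit_least_squares(X[3:], y[3:])  # retrain without those rows

print(np.allclose(w_unlearned, w_retrained))
```

Comparing the updated weights against weights retrained from scratch, as done in the last lines, is the usual reference point for judging how thoroughly data has been removed. It does not implement the feature injection test mentioned in the abstract.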
