Paper Title

Information Removal at the bottleneck in Deep Neural Networks

Paper Authors

Tartaglione, Enzo

Paper Abstract

Deep learning models are nowadays broadly deployed to solve an incredibly large variety of tasks. Commonly, leveraging the availability of "big data", deep neural networks are trained as black boxes, minimizing an objective function at their output. This, however, does not allow control over the propagation of specific features through the model, like gender or race, when solving an uncorrelated task. This raises issues both in the privacy domain (considering the propagation of unwanted information) and in that of bias (considering that these features are potentially used to solve the given task). In this work we propose IRENE, a method to achieve information removal at the bottleneck of deep neural networks, which explicitly minimizes the estimated mutual information between the features to be kept "private" and the target. Experiments on a synthetic dataset and on CelebA validate the effectiveness of the proposed approach, and open the road towards the development of approaches guaranteeing information removal in deep neural networks.
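
The abstract describes penalizing an estimated mutual information term at the bottleneck during training. The sketch below illustrates that general idea with a MINE-style estimator (Donsker-Varadhan bound, Belghazi et al., 2018): an estimator is trained to lower-bound I(Z; S) between the bottleneck features Z and a "private" attribute S, and the model is trained to minimize the task loss plus that estimate. All names (MineEstimator, lambda_mi, the toy encoder/classifier) are illustrative assumptions, not IRENE's actual estimator or objective, which are detailed in the paper.

```python
# Hypothetical sketch: remove "private" information from a bottleneck by
# penalizing a MINE-style mutual-information estimate. Illustrative only,
# not IRENE's own implementation.
import math
import torch
import torch.nn as nn

class MineEstimator(nn.Module):
    """Donsker-Varadhan lower bound on I(Z; S) from joint/shuffled samples."""
    def __init__(self, z_dim, s_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + s_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z, s):
        # Joint pairs (z_i, s_i) vs. product-of-marginals pairs (z_i, s_perm(i)).
        s_shuffled = s[torch.randperm(s.size(0))]
        t_joint = self.net(torch.cat([z, s], dim=1)).mean()
        t_marg = self.net(torch.cat([z, s_shuffled], dim=1)).squeeze(1)
        # E_joint[T] - log E_marg[exp(T)]  is a lower bound on I(Z; S).
        return t_joint - (torch.logsumexp(t_marg, dim=0) - math.log(t_marg.size(0)))

# Toy model: encoder producing the bottleneck Z, plus a target classifier.
encoder = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 8))
classifier = nn.Linear(8, 2)
mine = MineEstimator(z_dim=8, s_dim=1)
opt_model = torch.optim.Adam(
    list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
opt_mine = torch.optim.Adam(mine.parameters(), lr=1e-3)
lambda_mi = 1.0  # illustrative trade-off between task loss and MI penalty

x = torch.randn(64, 32)                   # inputs
y = torch.randint(0, 2, (64,))            # target labels
s = torch.randint(0, 2, (64, 1)).float()  # "private" attribute, e.g. gender

# Step 1: tighten the MI bound (estimator ascends; bottleneck is frozen).
loss_mine = -mine(encoder(x).detach(), s)
opt_mine.zero_grad(); loss_mine.backward(); opt_mine.step()

# Step 2: train the model to solve the task while pushing the MI estimate down.
z = encoder(x)
loss = nn.functional.cross_entropy(classifier(z), y) + lambda_mi * mine(z, s)
opt_model.zero_grad(); loss.backward(); opt_model.step()
```

In practice the two steps above alternate over mini-batches, so the bound stays tight as the representation changes; the concrete estimator, objective, and training schedule used by IRENE are those given in the paper.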
