Paper Title

Provable Fairness for Neural Network Models using Formal Verification

Paper Authors

Giorgian Borca-Tasciuc, Xingzhi Guo, Stanley Bak, Steven Skiena

Paper Abstract

Machine learning models are increasingly deployed for critical decision-making tasks, making it important to verify that they do not contain gender or racial biases picked up from training data. Typical approaches to achieve fairness revolve around efforts to clean or curate training data, with post-hoc statistical evaluation of the fairness of the model on evaluation data. In contrast, we propose techniques to prove fairness using recently developed formal methods that verify properties of neural network models. Beyond the strength of guarantee implied by a formal proof, our methods have the advantage that we do not need explicit training or evaluation data (which is often proprietary) in order to analyze a given trained model. In experiments on two familiar datasets in the fairness literature (COMPAS and ADULTS), we show that through proper training, we can reduce unfairness by an average of 65.4% at a cost of less than 1% in AUC score.
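
The abstract states that fairness can be proven directly on a trained network, without access to training or evaluation data. As a rough illustration only (not the authors' actual toolchain), the sketch below uses interval bound propagation to soundly bound how much a tiny ReLU network's score can change when a protected attribute is flipped, over an entire box of inputs; the network weights, the feature layout, and the tolerance `epsilon` are all hypothetical assumptions.

```python
import numpy as np

def ibp_forward(weights, biases, lower, upper):
    """Propagate an input box [lower, upper] through a ReLU network
    using interval bound propagation; returns output bounds."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        center = (upper + lower) / 2.0
        radius = (upper - lower) / 2.0
        new_center = W @ center + b
        new_radius = np.abs(W) @ radius
        lower, upper = new_center - new_radius, new_center + new_radius
        if i < len(weights) - 1:  # ReLU on hidden layers only
            lower, upper = np.maximum(lower, 0.0), np.maximum(upper, 0.0)
    return lower, upper

# Hypothetical 3-feature input [protected, age, income], scaled to [0, 1],
# and a randomly initialized 3-8-1 ReLU network standing in for a trained model.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 3)), rng.standard_normal((1, 8))]
biases = [rng.standard_normal(8), rng.standard_normal(1)]
box_lo, box_hi = np.zeros(3), np.ones(3)

# Output bounds with the protected attribute fixed to each group value.
lo0, hi0 = ibp_forward(weights, biases,
                       np.r_[0.0, box_lo[1:]], np.r_[0.0, box_hi[1:]])
lo1, hi1 = ibp_forward(weights, biases,
                       np.r_[1.0, box_lo[1:]], np.r_[1.0, box_hi[1:]])

# Sound (if loose) bound on |f(x, a=1) - f(x, a=0)| over the whole box.
max_gap = max(abs(hi1 - lo0).max(), abs(hi0 - lo1).max())
epsilon = 0.1  # assumed tolerance on the score gap; a modeling choice
print("certified fair within epsilon" if max_gap <= epsilon else "inconclusive")
```

Because interval bounds are conservative, a result of "certified fair" is a genuine guarantee over the whole input box, while "inconclusive" does not prove unfairness; the paper's verification-based approach targets tighter, complete analyses than this coarse sketch.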
