Paper Title
How reparametrization trick broke differentially-private text representation learning
Paper Authors
Paper Abstract
As privacy gains traction in the NLP community, researchers have started adopting various approaches to privacy-preserving methods. One of the favorite privacy frameworks, differential privacy (DP), is perhaps the most compelling thanks to its fundamental theoretical guarantees. Despite the apparent simplicity of the general concept of differential privacy, it seems non-trivial to get it right when applying it to NLP. In this short paper, we formally analyze several recent NLP papers proposing text representation learning using DPText (Beigi et al., 2019a,b; Alnasser et al., 2021; Beigi et al., 2021) and reveal their false claims of being differentially private. Furthermore, we also show a simple yet general empirical sanity check to determine whether a given implementation of a DP mechanism almost certainly violates the privacy loss guarantees. Our main goal is to raise awareness and help the community understand potential pitfalls of applying differential privacy to text representation learning.
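As a rough illustration of what such an empirical sanity check can look like (a minimal sketch only, not the paper's actual procedure): a pure ε-DP mechanism M must satisfy Pr[M(x) ∈ S] ≤ e^ε · Pr[M(x′) ∈ S] for every pair of neighbouring inputs x, x′ and every output set S, so repeatedly sampling the mechanism on two neighbouring inputs and comparing empirical output probabilities can expose an implementation whose observed privacy loss clearly exceeds the claimed ε. The choice of the Laplace mechanism, the deliberately broken variant, and all function names below are assumptions made purely for illustration.

```python
# A minimal sketch of an empirical sanity check for a pure eps-DP mechanism,
# illustrated on the scalar Laplace mechanism (an assumption for concreteness;
# not the procedure used in the paper). For an eps-DP mechanism M and
# neighbouring inputs x, x', every output event S must satisfy
#   Pr[M(x) in S] <= exp(eps) * Pr[M(x') in S].
# We estimate both probabilities by Monte Carlo over histogram bins and flag
# the implementation if the empirical log-ratio clearly exceeds eps.
import numpy as np

def laplace_mechanism(value: float, sensitivity: float, eps: float,
                      rng: np.random.Generator) -> float:
    """Correct Laplace mechanism: noise scale = sensitivity / eps."""
    return value + rng.laplace(loc=0.0, scale=sensitivity / eps)

def broken_mechanism(value: float, sensitivity: float, eps: float,
                     rng: np.random.Generator) -> float:
    """Deliberately broken variant (noise scale too small) for comparison."""
    return value + rng.laplace(loc=0.0, scale=sensitivity / (10 * eps))

def empirical_privacy_loss(mechanism, x: float, x_neighbor: float,
                           sensitivity: float, eps: float,
                           n_samples: int = 200_000, n_bins: int = 50) -> float:
    """Largest empirical log-probability ratio over output bins."""
    rng = np.random.default_rng(0)
    out_x = np.array([mechanism(x, sensitivity, eps, rng) for _ in range(n_samples)])
    out_y = np.array([mechanism(x_neighbor, sensitivity, eps, rng) for _ in range(n_samples)])
    lo, hi = np.percentile(np.concatenate([out_x, out_y]), [1, 99])
    bins = np.linspace(lo, hi, n_bins + 1)
    p, _ = np.histogram(out_x, bins=bins)
    q, _ = np.histogram(out_y, bins=bins)
    mask = (p > 0) & (q > 0)  # compare only bins observed for both inputs
    return float(np.max(np.abs(np.log(p[mask] / q[mask]))))

if __name__ == "__main__":
    eps, sensitivity = 1.0, 1.0
    x, x_neighbor = 0.0, 1.0  # neighbouring inputs differing by the sensitivity
    for name, mech in [("correct Laplace", laplace_mechanism),
                       ("broken (scale / 10)", broken_mechanism)]:
        loss = empirical_privacy_loss(mech, x, x_neighbor, sensitivity, eps)
        verdict = "consistent with eps" if loss <= 1.2 * eps else "almost certainly violates eps"
        print(f"{name}: max empirical privacy loss ~ {loss:.2f}, claimed eps = {eps} -> {verdict}")
```

Running this sketch, the correctly calibrated mechanism stays close to the claimed ε (up to sampling noise), while the under-noised variant exhibits an empirical privacy loss far above it; a test of this shape cannot prove an implementation is differentially private, but it can show that it almost certainly is not.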