Paper Title
Are You Robert or RoBERTa? Deceiving Online Authorship Attribution Models Using Neural Text Generators
Paper Authors
Paper Abstract
Recently, there has been a rise in the development of powerful pre-trained natural language models, including GPT-2, Grover, and XLM. These models have shown state-of-the-art capabilities on a variety of NLP tasks, including question answering, content summarisation, and text generation. Alongside this, there have been many studies focused on online authorship attribution (AA): that is, the use of models to identify the authors of online texts. Given the power of natural language models in generating convincing texts, this paper examines the degree to which these language models can generate texts capable of deceiving online AA models. Experimenting with both blog and Twitter data, we utilise GPT-2 language models to generate texts from the existing posts of online users. We then examine whether these AI-based text generators are capable of mimicking authorial style to such a degree that they can deceive typical AA models. From this, we find that current AI-based text generators are able to successfully mimic authorship, showing capabilities towards this on both datasets. Our findings, in turn, highlight the current capacity of powerful natural language models to generate original online posts that mimic authorial style sufficiently to deceive popular AA methods; a key finding given the proposed role of AA in real-world applications such as spam detection and forensic investigation.
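To make the attribution side of the abstract concrete, the sketch below is a deliberately minimal, stdlib-only illustration of what an AA model does: each author is profiled by character-trigram frequencies, and a new text is attributed to the author whose profile is most similar (cosine similarity). This is only a toy stand-in for intuition; the paper's actual AA models and its GPT-2 generation pipeline are not shown here, and all names and example posts below are invented.

```python
# Toy authorship-attribution (AA) illustration: character-trigram profiles
# per author, attribution by cosine similarity. Not the paper's models.
from collections import Counter
from math import sqrt

def trigram_profile(text):
    """L2-normalised character-trigram counts for a text."""
    t = text.lower()
    counts = Counter(t[i:i + 3] for i in range(len(t) - 2))
    norm = sqrt(sum(c * c for c in counts.values())) or 1.0
    return {g: c / norm for g, c in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse trigram profiles."""
    return sum(v * q.get(g, 0.0) for g, v in p.items())

def attribute(text, author_posts):
    """Return the author whose pooled posts are most similar to `text`."""
    profiles = {a: trigram_profile(" ".join(ps))
                for a, ps in author_posts.items()}
    target = trigram_profile(text)
    return max(profiles, key=lambda a: cosine(target, profiles[a]))

# Hypothetical example users and posts, purely for illustration.
posts = {
    "alice": ["i love hiking in the mountains",
              "the mountain trails were lovely"],
    "bob": ["the quarterly report is attached",
            "please review the attached report"],
}
print(attribute("hiking those mountain trails again", posts))  # → alice
```

Under this framing, the paper's experiment amounts to asking whether GPT-2, fine-tuned on a user's posts, can produce new text that such a classifier still attributes to that user.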