Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment

Paper Title

Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment

Authors

Raina, Vyas; Gales, Mark

Abstract

Grammatical Error Correction (GEC) systems perform a sequence-to-sequence task: given an input word sequence containing grammatical errors, the GEC system outputs a grammatically correct word sequence. With the advent of deep learning methods, automated GEC systems have become increasingly popular. For example, GEC systems are often used on speech transcriptions of English learners as a form of assessment and feedback; these powerful GEC systems can be used to automatically measure an aspect of a candidate's fluency. The count of edits from a candidate's input sentence (or essay) to a GEC system's grammatically corrected output sentence is indicative of the candidate's language ability, where fewer edits suggest better fluency. The count of edits can thus be viewed as a fluency score, with zero implying perfect fluency. However, although deep-learning-based GEC systems are extremely powerful and accurate, they are susceptible to adversarial attacks: an adversary can introduce a small, specific change at the input of a system that causes a large, undesired change at the output. When considering the application of GEC systems to automated language assessment, the aim of an adversary could be to cheat by making a small change to a grammatically incorrect input sentence that conceals the errors from the GEC system, such that no edits are found and the candidate is unjustly awarded a perfect fluency score. This work examines a simple universal substitution adversarial attack that non-native speakers of English could realistically employ to deceive GEC systems used for assessment.
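
To make the edit-count fluency score described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes whitespace tokenization and uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever edit-extraction tool the paper actually uses; the function name `count_edits` is hypothetical.

```python
from difflib import SequenceMatcher


def count_edits(source: str, corrected: str) -> int:
    """Count word-level edit spans between a sentence and its correction.

    Assumption: one contiguous replace/insert/delete span counts as one edit.
    The paper's own edit definition may differ.
    """
    src_tokens = source.split()
    cor_tokens = corrected.split()
    matcher = SequenceMatcher(a=src_tokens, b=cor_tokens, autojunk=False)
    # Every opcode other than "equal" marks a span the GEC output changed.
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


# Hypothetical learner sentence and a hypothetical GEC output for it.
score = count_edits("He have two cat", "He has two cats")
```

Under this sketch, a score of zero means the GEC output is identical to the input, which is exactly the outcome an adversary would try to force for an ungrammatical sentence.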
