Paper title
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension
Paper authors
Paper abstract
Task-adaptive pre-training (TAPT) alleviates the lack of labelled data and lifts performance by adapting unlabelled data to the downstream task. Unfortunately, existing adaptations mainly rely on deterministic rules that do not generalize well. Here, we propose Clozer, a sequence-tagging-based cloze answer extraction method used in TAPT that can be extended to any cloze-style machine reading comprehension (MRC) downstream task. We experiment on multiple-choice cloze-style MRC tasks and show that Clozer increases the effectiveness of TAPT in lifting model performance significantly more than the oracle and state-of-the-art approaches, and we demonstrate that Clozer is able to recognize the gold answers independently of any heuristics.
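
The abstract does not spell out how a tagged span becomes a cloze example, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that some sequence tagger has already assigned each token a binary label (1 for "part of an answer span", 0 otherwise) and blanks out each contiguous tagged span to form a cloze question. The function name `make_cloze`, the binary tag scheme, and the `@placeholder` blank token are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

BLANK = "@placeholder"  # hypothetical blank token, not taken from the paper


def make_cloze(tokens: List[str], tags: List[int]) -> List[Tuple[str, str]]:
    """Turn each contiguous run of tokens tagged 1 into one cloze example.

    tags[i] == 1 means an (assumed) tagger marked tokens[i] as part of an
    answer span. Returns a list of (question_with_blank, answer) pairs.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    examples: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        if tags[i] == 1:
            # Extend j to the end of the current tagged span.
            j = i
            while j < len(tokens) and tags[j] == 1:
                j += 1
            answer = " ".join(tokens[i:j])
            # Replace the whole span with a single blank token.
            question = " ".join(tokens[:i] + [BLANK] + tokens[j:])
            examples.append((question, answer))
            i = j
        else:
            i += 1
    return examples


if __name__ == "__main__":
    toks = "Marie Curie won the Nobel Prize".split()
    for question, answer in make_cloze(toks, [1, 1, 0, 0, 1, 1]):
        print(question, "->", answer)
```

In a TAPT setting, pairs like these could serve as synthetic cloze training examples built from unlabelled text. A learned tagger would replace the hand-written tag list used in the usage example.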