Understanding Iterative Revision from Human-Written Text

Paper Title

Understanding Iterative Revision from Human-Written Text

Authors

Wanyu Du, Vipul Raheja, Dhruv Kumar, Zae Myung Kim, Melissa Lopez, Dongyeop Kang

Abstract

Writing is, by nature, a strategic, adaptive, and, more importantly, iterative process. A crucial part of writing is editing and revising the text. Previous works on text revision have focused on defining edit intention taxonomies within a single domain or developing computational models with a single level of edit granularity, such as sentence-level edits, which differ from human revision cycles. This work describes IteraTeR: the first large-scale, multi-domain, edit-intention annotated corpus of iteratively revised text. In particular, IteraTeR is collected based on a new framework to comprehensively model iterative text revisions that generalize to various domains of formal writing, edit intentions, revision depths, and granularities. When we incorporate our annotated edit intentions, both generative and edit-based text revision models improve significantly on automatic evaluations. Through our work, we better understand the text revision process, making vital connections between edit intentions and writing quality, and enabling the creation of diverse corpora to support computational modeling of iterative text revisions.
