Paper Title
Autoregressive Structured Prediction with Language Models
Paper Authors
Paper Abstract
Recent years have seen a paradigm shift in NLP towards using pretrained language models (PLMs) for a wide range of tasks. However, representing structures (e.g., tagged text, coreference chains) in a way that PLMs can capture involves many difficult design decisions. Prior work on structured prediction with PLMs typically flattens the structured output into a sequence, which limits the quality of the structural information being learned and leads to inferior performance compared to classic discriminative models. In this work, we describe an approach that models structures as sequences of actions predicted autoregressively by PLMs, allowing in-structure dependencies to be learned without any loss. Our approach achieves new state-of-the-art results on all the structured prediction tasks we consider, namely named entity recognition, end-to-end relation extraction, and coreference resolution.
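
A minimal sketch of the general idea, not the paper's actual implementation: the structured output is linearized into discrete actions so that prediction factorizes autoregressively as p(y | x) = prod_t p(a_t | a_<t, x), letting each action condition on the full structure built so far. The action inventory (BEGIN/COPY/END/STOP) and the abstract score_fn below are invented for illustration.

```python
# Hypothetical sketch: linearize a structured output (here, NER spans)
# into a sequence of discrete actions, then decode them one at a time so
# that each action conditions on everything generated before it.
# Action names (BEGIN/COPY/END/STOP) and score_fn are illustrative only,
# not taken from the paper.

from typing import Callable, List, Tuple


def spans_to_actions(tokens: List[str],
                     spans: List[Tuple[int, int, str]]) -> List[str]:
    """Linearize labeled spans (start, end inclusive, type) left to right.

    Every token emits COPY; span boundaries additionally emit
    BEGIN(type) before the first token and END after the last one.
    """
    starts = {s: t for s, _, t in spans}
    ends = {e for _, e, _ in spans}
    actions: List[str] = []
    for i in range(len(tokens)):
        if i in starts:
            actions.append(f"BEGIN({starts[i]})")
        actions.append("COPY")
        if i in ends:
            actions.append("END")
    return actions


def greedy_decode(tokens: List[str],
                  score_fn: Callable[[List[str], List[str], str], float],
                  action_set: List[str],
                  max_len: int) -> List[str]:
    """Autoregressive greedy decoding: at each step, score every candidate
    action given the input and the action history a_<t, and keep the best.
    A learned model (e.g., a PLM head) would supply score_fn; it is left
    abstract here."""
    history: List[str] = []
    for _ in range(max_len):
        best = max(action_set, key=lambda a: score_fn(tokens, history, a))
        if best == "STOP":
            break
        history.append(best)
    return history


if __name__ == "__main__":
    toks = ["Barack", "Obama", "visited", "IBM"]
    gold = spans_to_actions(toks, [(0, 1, "PER"), (3, 3, "ORG")])
    print(gold)
    # ['BEGIN(PER)', 'COPY', 'COPY', 'END', 'COPY', 'BEGIN(ORG)', 'COPY', 'END']
```

Because the gold structure maps losslessly to and from the action sequence, training reduces to ordinary next-action prediction over this sequence, which is what lets structural dependencies be learned without flattening-induced information loss.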