Paper Title
Code4Struct: Code Generation for Few-Shot Event Structure Prediction
Paper Authors
Paper Abstract
Large Language Model (LLM) trained on a mixture of text and code has demonstrated impressive capability in translating natural language (NL) into structured code. We observe that semantic structures can be conveniently translated into code and propose Code4Struct to leverage such text-to-structure translation capability to tackle structured prediction tasks. As a case study, we formulate Event Argument Extraction (EAE) as converting text into event-argument structures that can be represented as a class object using code. This alignment between structures and code enables us to take advantage of Programming Language (PL) features such as inheritance and type annotation to introduce external knowledge or add constraints. We show that, with sufficient in-context examples, formulating EAE as a code generation problem is advantageous over using variants of text-based prompts. Despite only using 20 training event instances for each event type, Code4Struct is comparable to supervised models trained on 4,202 instances and outperforms current state-of-the-art (SOTA) trained on 20-shot data by 29.5% absolute F1. Code4Struct can use 10-shot training data from a sibling event type to predict arguments for zero-resource event types and outperforms the zero-shot baseline by 12% absolute F1.
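The abstract's core idea — representing an event-argument structure as a class object, using inheritance and type annotations to encode constraints — can be illustrated with a minimal sketch. The class and role names below (`Event`, `TransportEvent`, `agent`, `artifact`, `destination`) are hypothetical placeholders, not the paper's actual ontology or prompt format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entity:
    """A text span filling an argument role."""
    name: str

@dataclass
class Event:
    """Base class; specific event types inherit from it (PL inheritance
    mirrors the event-type hierarchy)."""

@dataclass
class TransportEvent(Event):
    # Type annotations act as constraints on which values may fill each role.
    agent: List[Entity] = field(default_factory=list)
    artifact: List[Entity] = field(default_factory=list)
    destination: List[Entity] = field(default_factory=list)

# Given the class definitions and a sentence as a prompt, the LLM would be
# asked to complete an instantiation such as:
event = TransportEvent(
    agent=[Entity("the soldiers")],
    artifact=[Entity("supplies")],
    destination=[Entity("the base")],
)
print(event.agent[0].name)  # prints "the soldiers"
```

Because the output is code, extracted arguments can be read back by simply accessing the object's typed fields rather than parsing free-form text.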