Paper Title
Graph Pre-training for AMR Parsing and Generation
Paper Authors
Paper Abstract
Abstract meaning representation (AMR) highlights the core semantic information of text in a graph structure. Recently, pre-trained language models (PLMs) have advanced the tasks of AMR parsing and AMR-to-text generation, respectively. However, PLMs are typically pre-trained on textual data and are thus suboptimal for modeling structural knowledge. To this end, we investigate graph self-supervised training to improve the structure awareness of PLMs over AMR graphs. In particular, we introduce two graph auto-encoding strategies for graph-to-graph pre-training and four tasks to integrate text and graph information during pre-training. We further design a unified framework to bridge the gap between pre-training and fine-tuning tasks. Experiments on both AMR parsing and AMR-to-text generation show the superiority of our model. To our knowledge, we are the first to consider pre-training on semantic graphs.
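To make the graph auto-encoding idea concrete, below is a minimal, self-contained sketch of one plausible corruption strategy: masking concept nodes in a PENMAN-linearized AMR graph so that a sequence-to-sequence PLM can be trained to reconstruct the original graph from the corrupted one. The `<mask>` token, the tokenization heuristic, and the masking ratio are illustrative assumptions; the abstract does not specify the paper's exact corruption scheme.

```python
import random

def mask_amr_nodes(linearized_amr: str, mask_ratio: float = 0.35, seed: int = 1):
    """Corrupt a linearized AMR graph for a graph-to-graph denoising objective.

    Returns (corrupted_graph, original_graph): the model's input and its
    reconstruction target.
    """
    rng = random.Random(seed)
    tokens = linearized_amr.split()
    corrupted = list(tokens)
    for i in range(1, len(tokens)):
        # In PENMAN-style linearization, a concept node follows '/'
        # (e.g. "w / want-01"); ':'-prefixed relations, variables, and
        # parentheses are structural and left intact.
        if tokens[i - 1] == "/" and rng.random() < mask_ratio:
            corrupted[i] = "<mask>"
    return " ".join(corrupted), linearized_amr

src, tgt = mask_amr_nodes(
    "( w / want-01 :ARG0 ( b / boy ) :ARG1 ( g / go-02 :ARG0 b ) )"
)
print(src)  # ( w / <mask> :ARG0 ( b / boy ) :ARG1 ( g / go-02 :ARG0 b ) )
print(tgt)  # the uncorrupted graph, used as the reconstruction target
```

A seq2seq PLM trained on such (corrupted, original) pairs must use the surrounding graph structure to recover the masked concepts, which is one way a model can acquire the structure awareness the abstract describes.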