Paper Title
Improving Compositional Generalization in Semantic Parsing
Paper Authors
Paper Abstract
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently. Specifically, compositional generalization, i.e., whether a model generalizes to new structures built of components observed during training, has sparked substantial interest. In this work, we investigate compositional generalization in semantic parsing, a natural test-bed for compositional generalization, as output programs are constructed from sub-components. We analyze a wide variety of models and propose multiple extensions to the attention module of the semantic parser, aiming to improve compositional generalization. We find that the following factors improve compositional generalization: (a) using contextual representations, such as ELMo and BERT, (b) informing the decoder what input tokens have previously been attended to, (c) training the decoder attention to agree with pre-computed token alignments, and (d) downsampling examples corresponding to frequent program templates. While we substantially reduce the gap between in-distribution and OOD generalization, performance on OOD compositions is still substantially lower.
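To make factors (b)-(d) more concrete, the sketch below shows one way such components could be implemented. It is a minimal illustration under assumptions (PyTorch; hypothetical names such as CoverageAttention, alignment_agreement_loss, and downsample_by_template), not the authors' code or the paper's exact formulation.

# Hypothetical sketch, not the authors' implementation.
# (b) a decoder attention step that conditions on a coverage vector of
#     previously attended input tokens,
# (c) a loss pushing attention weights toward pre-computed token alignments,
# (d) downsampling training examples from frequent program templates.

import random
from collections import defaultdict

import torch
import torch.nn as nn
import torch.nn.functional as F

class CoverageAttention(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # Score each encoder state against [decoder state; encoder state; coverage].
        self.score = nn.Linear(2 * hidden_dim + 1, 1)

    def forward(self, dec_state, enc_states, coverage):
        # dec_state: (batch, hidden); enc_states: (batch, src_len, hidden);
        # coverage: (batch, src_len), the sum of past attention weights per input token.
        src_len = enc_states.size(1)
        dec_exp = dec_state.unsqueeze(1).expand(-1, src_len, -1)
        features = torch.cat([dec_exp, enc_states, coverage.unsqueeze(-1)], dim=-1)
        scores = self.score(features).squeeze(-1)            # (batch, src_len)
        attn = F.softmax(scores, dim=-1)                      # attention weights
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)
        return context, attn, coverage + attn                 # updated coverage

def alignment_agreement_loss(attn, gold_alignment, step_mask):
    # attn, gold_alignment: (batch, tgt_len, src_len); gold_alignment is a 0/1
    # matrix from an external word aligner. step_mask: (batch, tgt_len), 1 for
    # target steps that have at least one aligned input token.
    eps = 1e-8
    per_step = -(gold_alignment * (attn + eps).log()).sum(-1)
    return (per_step * step_mask).sum() / step_mask.sum().clamp(min=1)

def downsample_by_template(examples, get_template, max_per_template=100, seed=0):
    # Cap the number of training examples per program template so that
    # frequent templates do not dominate training.
    rng = random.Random(seed)
    by_template = defaultdict(list)
    for ex in examples:
        by_template[get_template(ex)].append(ex)
    kept = []
    for group in by_template.values():
        rng.shuffle(group)
        kept.extend(group[:max_per_template])
    return kept

In this sketch, the alignment loss would be added to the standard token-level cross-entropy during training, and the coverage vector is reset at the start of each decoded program; both are common ways to realize the ideas the abstract describes, but the exact choices here are assumptions.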