Paper Title

babble: Learning Better Abstractions with E-Graphs and Anti-Unification

Paper Authors

David Cao, Rose Kunkel, Chandrakana Nandi, Max Willsey, Zachary Tatlock, Nadia Polikarpova

Paper Abstract

Library learning compresses a given corpus of programs by extracting common structure from the corpus into reusable library functions. Prior work on library learning suffers from two limitations that prevent it from scaling to larger, more complex inputs. First, it explores too many candidate library functions that are not useful for compression. Second, it is not robust to syntactic variation in the input. We propose library learning modulo theory (LLMT), a new library learning algorithm that additionally takes as input an equational theory for a given problem domain. LLMT uses e-graphs and equality saturation to compactly represent the space of programs equivalent modulo the theory, and uses a novel e-graph anti-unification technique to find common patterns in the corpus more directly and efficiently. We implemented LLMT in a tool named BABBLE. Our evaluation shows that BABBLE achieves better compression orders of magnitude faster than the state of the art. We also provide a qualitative evaluation showing that BABBLE learns reusable functions on inputs previously out of reach for library learning.
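To make the anti-unification step concrete, below is a minimal sketch of plain syntactic anti-unification (least general generalization) over terms encoded as nested tuples. The term encoding, the anti_unify helper, and the "?x0"-style pattern variables are assumptions of this sketch, not BABBLE's API; BABBLE's contribution is running this idea over e-graphs, modulo an equational theory, rather than over individual syntax trees.

```python
# A minimal sketch of syntactic anti-unification (least general
# generalization). Terms are nested tuples ("op", child, ...); leaves
# are strings. The encoding and names here are illustrative assumptions,
# not BABBLE's actual implementation, which works over e-graphs.

def anti_unify(t1, t2, vars=None):
    """Return the most specific pattern that matches both t1 and t2."""
    if vars is None:
        vars = {}
    # Identical subterms generalize to themselves.
    if t1 == t2:
        return t1
    # Same operator and arity: keep the operator, recurse on children.
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        return (t1[0],) + tuple(
            anti_unify(a, b, vars) for a, b in zip(t1[1:], t2[1:]))
    # Mismatch: introduce a pattern variable, reusing it if this exact
    # pair of subterms was abstracted before (this reuse is what makes
    # the result the *least* general generalization).
    return vars.setdefault((t1, t2), f"?x{len(vars)}")


# Two corpus programs that differ only in the shape being drawn:
p1 = ("scale", "2", ("rotate", "45", "circle"))
p2 = ("scale", "2", ("rotate", "45", "square"))
print(anti_unify(p1, p2))
# ('scale', '2', ('rotate', '45', '?x0')) -- a candidate library
# function with one parameter that compresses both programs.
```

In this toy setting the common pattern doubles as a candidate library function; the abstract's point is that doing this over an e-graph lets the same search also cover all programs equivalent to p1 and p2 under the domain's equational theory.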
