论文标题
数据类型是语法引导基因编程的更符合人体工程学的前端
Data types as a more ergonomic frontend for Grammar-Guided Genetic Programming
论文作者
论文摘要
遗传编程(GP)是一种启发式方法,可以应用于许多机器学习,优化和工程问题。特别是,它已被广泛用于软件工程中,用于测试案例生成,程序合成和软件改进(GI)。 语法引导的基因编程(GGGP)方法使用户可以完善有效程序解决方案的领域。 Backus正常形式是描述GGGP的无上下文语法(CFG)的最受欢迎的界面。 BNF及其衍生工具的缺点是使语法语言和程序的目标语言交织在一起。 我们建议将语法嵌入到框架的主机语言中。当使用主机语言类型系统利用所有现有的工具时,这种方法具有与BNF和EBNF相同的表达能力:衬里,格式化器,类型检查器,自动完成和旧版代码支持。这些工具通常在设计软件,尤其是GP系统方面具有实用性。 我们还提出了元手机,用户定义的树生成系统的覆盖率。与现有的CFG方法相比,该技术以更实用性和表达能力扩展了我们面向对象的编码,从而实现了属性语法的表达能力,但没有语法与目标语言二元性。 此外,我们证明这种方法是可行的,显示了一个例子python实施为证明。我们还将我们的方法与文本BNF代表W.R.T.进行了比较。表达力和人体工程学。这些优势并非以绩效为代价,如我们对针对小马示例实施的5个基准的经验评估所示。我们得出的结论是,我们的方法具有更好的人体工程学,具有基于文本BNF的语法编码的表达能力和性能相同。
Genetic Programming (GP) is an heuristic method that can be applied to many Machine Learning, Optimization and Engineering problems. In particular, it has been widely used in Software Engineering for Test-case generation, Program Synthesis and Improvement of Software (GI). Grammar-Guided Genetic Programming (GGGP) approaches allow the user to refine the domain of valid program solutions. Backus Normal Form is the most popular interface for describing Context-Free Grammars (CFG) for GGGP. BNF and its derivatives have the disadvantage of interleaving the grammar language and the target language of the program. We propose to embed the grammar as an internal Domain-Specific Language in the host language of the framework. This approach has the same expressive power as BNF and EBNF while using the host language type-system to take advantage of all the existing tooling: linters, formatters, type-checkers, autocomplete, and legacy code support. These tools have a practical utility in designing software in general, and GP systems in particular. We also present Meta-Handlers, user-defined overrides of the tree-generation system. This technique extends our object-oriented encoding with more practicability and expressive power than existing CFG approaches, achieving the same expressive power of Attribute Grammars, but without the grammar vs target language duality. Furthermore, we evidence that this approach is feasible, showing an example Python implementation as proof. We also compare our approach against textual BNF-representations w.r.t. expressive power and ergonomics. These advantages do not come at the cost of performance, as shown by our empirical evaluation on 5 benchmarks of our example implementation against PonyGE2. We conclude that our approach has better ergonomics with the same expressive power and performance of textual BNF-based grammar encodings.