Paper Title
Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization
Paper Authors
Abstract
Molecular optimization is a fundamental goal in the chemical sciences and is of central interest to drug and materials design. In recent years, significant progress has been made on challenging problems across many aspects of computational molecular optimization, with an emphasis on high validity, diversity, and, most recently, synthesizability. Despite this progress, many papers report results on trivial or self-designed tasks, which makes it harder to directly assess the performance of new methods. Moreover, the sample efficiency of the optimization (the number of molecules evaluated by the oracle) is rarely discussed, despite being an essential consideration for realistic discovery applications. To fill this gap, we have created an open-source benchmark for practical molecular optimization, PMO, to facilitate the transparent and reproducible evaluation of algorithmic advances in molecular optimization. This paper thoroughly investigates the performance of 25 molecular design algorithms on 23 tasks, with a particular focus on sample efficiency. Our results show that most "state-of-the-art" methods fail to outperform their predecessors under a limited oracle budget of 10K queries, and that no existing algorithm can efficiently solve certain molecular optimization problems in this setting. We analyze the influence of optimization algorithm choices, molecular assembly strategies, and oracle landscapes on optimization performance to inform future algorithm development and benchmarking. PMO provides a standardized experimental setup for comprehensively evaluating and comparing new molecular optimization methods with existing ones. All code can be found at https://github.com/wenhao-gao/mol_opt.