Paper Title
Structured Q-learning For Antibody Design
Paper Authors
Paper Abstract
Optimizing combinatorial structures is core to many real-world problems, such as those encountered in the life sciences. For example, one of the crucial steps in antibody design is to find an arrangement of amino acids in a protein sequence that improves its binding with a pathogen. Combinatorial optimization of antibodies is difficult because the search spaces are extremely large and the objectives are non-linear. Even for modest antibody design problems, where proteins have a sequence length of eleven, we must search over roughly 2.05 x 10^14 candidate structures. Applying traditional Reinforcement Learning algorithms such as Q-learning to combinatorial optimization results in poor performance. We propose Structured Q-learning (SQL), an extension of Q-learning that incorporates structural priors for combinatorial optimization. Using a molecular docking simulator, we demonstrate that SQL finds sequences with high binding energy and performs favourably against baselines on eight challenging antibody design tasks, including designing antibodies for SARS-CoV.
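The search-space figure quoted in the abstract can be checked with a short calculation. The sketch below assumes that each of the eleven positions may hold any of the 20 standard amino acids, with no further constraints. That assumption is not stated in the abstract but reproduces its number: 20^11 = 2.048 x 10^14, which rounds to 2.05 x 10^14. The names AMINO_ACIDS and search_space_size are illustrative only.

```python
# One-letter codes of the 20 standard amino acids.
# Assumption: every position can take any of these, with no constraints.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def search_space_size(sequence_length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Count the distinct sequences of the given length over the alphabet."""
    return len(alphabet) ** sequence_length


# Sequence length of eleven, as in the abstract.
size = search_space_size(11)
print(f"{size:,} candidate sequences (about {size:.3e})")
```

Each extra position multiplies the count by 20, so exhaustively evaluating every sequence with a docking simulator quickly becomes impractical.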