Paper Title
Rearrangement Events on Circular Genomes
Paper Authors
Paper Abstract
Early literature on genome rearrangement modelling viewed the problem of computing evolutionary distances as an inherently combinatorial one. In particular, attention was given to estimating distances using the minimum number of events required to transform one genome into another. In hindsight, this approach is analogous to early methods for inferring phylogenetic trees from DNA sequences, such as maximum parsimony: both are motivated by the principle that the true distance minimises evolutionary change, and both are effective if this principle is a true reflection of reality. Recent literature considers genome rearrangement under statistical models, continuing this parallel with DNA-based methods; the goal here is to use model-based methods (for example, maximum likelihood techniques) to compute distance estimates that incorporate the large number of rearrangement paths that can transform one genome into another. Crucially, this approach requires one to decide upon a set of feasible rearrangement events. In this paper, we focus on characterising well-motivated models for signed, uni-chromosomal circular genomes in which the number of regions remains fixed. Since rearrangements are often described mathematically using permutations, we isolate the sets of permutations representing rearrangements that are biologically reasonable in this context, for example inversions and translocations. We provide precise mathematical expressions for these rearrangements, and then describe them in terms of the set of cuts made in the genome when they are applied. We directly compare cuts to breakpoints, and use this concept to count the distinct rearrangement actions that apply a given number of cuts. Finally, we provide some examples of rearrangement models, and discuss some questions that arise when defining plausible models.
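
As an illustration of the permutation view described in the abstract, the following is a minimal Python sketch of one rearrangement, an inversion, acting on a signed circular genome. The list-of-signed-integers encoding, the function name `invert`, and the index conventions are assumptions made for this example and are not the paper's notation.

```python
from typing import List

# Assumed encoding: a genome is a list of signed region labels,
# read circularly (the last entry is adjacent to the first).
Genome = List[int]


def invert(genome: Genome, start: int, length: int) -> Genome:
    """Invert `length` consecutive regions beginning at index `start`.

    The segment is read circularly (indices wrap around), its order is
    reversed, and the sign (orientation) of every region in it is flipped.
    """
    n = len(genome)
    if not 0 < length <= n:
        raise ValueError("length must be between 1 and the number of regions")

    # Positions occupied by the segment, wrapping past the end of the list.
    positions = [(start + k) % n for k in range(length)]

    # Reverse the order of the segment and flip each sign.
    flipped = [-genome[p] for p in reversed(positions)]

    result = list(genome)
    for p, region in zip(positions, flipped):
        result[p] = region
    return result
```

For example, `invert([1, 2, 3, 4, 5], 1, 3)` returns `[1, -4, -3, -2, 5]`. A segment that wraps past the end of the list is handled in the same way, reflecting the circular reading of the genome; the number of regions is unchanged, as in the fixed-size setting the abstract describes.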