Paper title
Relative Timing Information and Orthology in Evolutionary Scenarios
Authors
Abstract
Evolutionary scenarios describing the evolution of a family of genes within a collection of species comprise the mapping of the vertices of a gene tree $T$ to vertices and edges of a species tree $S$. The relative timing of the last common ancestor of two extant genes (leaves of $T$) and the last common ancestor of the two species (leaves of $S$) in which they reside is indicative of horizontal gene transfers (HGT) and ancient duplications. Orthologous gene pairs, on the other hand, require that their last common ancestor coincides with a corresponding speciation event. The relative timing information of gene and species divergences is captured by three colored graphs that have the extant genes as vertices and the species in which the genes are found as vertex colors: the equal-divergence-time (EDT) graph, the later-divergence-time (LDT) graph, and the prior-divergence-time (PDT) graph, which together form an edge partition of the complete graph. Here we give a complete characterization in terms of informative and forbidden triples that can be read off the three graphs, and we provide a polynomial-time algorithm for constructing an evolutionary scenario that explains the graphs, provided such a scenario exists. We show that every EDT graph is perfect. While the information about LDT and PDT graphs is necessary to recognize EDT graphs in polynomial time for general scenarios, this extra information can be dropped in the HGT-free case. However, recognition of EDT graphs without knowledge of putative LDT and PDT graphs is NP-complete for general scenarios. In contrast, PDT graphs can be recognized in polynomial time. We finally connect the EDT graph to the alternative definitions of orthology that have been proposed for scenarios with horizontal gene transfer. With one exception, the corresponding graphs are shown to be colored cographs.
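To make the edge partition concrete, the following is a minimal sketch of how the three graphs could be read off a dated scenario. It is not the algorithm of the paper. It assumes that both trees are given as child-to-parent maps, that every vertex carries an age (time before present, so larger means older, an assumed convention), and that `sigma` assigns each extant gene to its species. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def ancestors(parent, v):
    """Return the path from v up to the root, starting with v itself."""
    path = [v]
    while parent.get(v) is not None:
        v = parent[v]
        path.append(v)
    return path


def lca(parent, a, b):
    """Last common ancestor of a and b in a tree given as a child -> parent map."""
    seen = set(ancestors(parent, a))
    for v in ancestors(parent, b):
        if v in seen:
            return v
    raise ValueError("vertices are not in the same tree")


def divergence_graphs(gene_parent, gene_age, species_parent, species_age, sigma):
    """Split all pairs of extant genes into EDT, LDT and PDT edge sets.

    Ages are compared exactly, so they should be integers or exactly
    representable numbers (an assumption of this sketch).
    """
    edt, ldt, pdt = set(), set(), set()
    for x, y in combinations(sorted(sigma), 2):
        t_gene = gene_age[lca(gene_parent, x, y)]
        t_spec = species_age[lca(species_parent, sigma[x], sigma[y])]
        if t_gene == t_spec:
            edt.add((x, y))   # genes and species diverged at the same time
        elif t_gene < t_spec:
            ldt.add((x, y))   # genes diverged later (more recently) than the species
        else:
            pdt.add((x, y))   # genes diverged prior to (earlier than) the species
    return edt, ldt, pdt


# Hypothetical toy scenario: species tree ((B, C)U, A)R.
species_parent = {"R": None, "A": "R", "U": "R", "B": "U", "C": "U"}
species_age = {"R": 2, "U": 1, "A": 0, "B": 0, "C": 0}

# Gene tree with a duplication (g1, age 1.5) and a transfer-like
# late divergence (g3, age 0.5) that places gene a2 in species A.
gene_parent = {
    "g0": None, "a": "g0", "g1": "g0",
    "g2": "g1", "c2": "g1",
    "g3": "g2", "c": "g2",
    "b": "g3", "a2": "g3",
}
gene_age = {"g0": 2, "g1": 1.5, "g2": 1, "g3": 0.5,
            "a": 0, "a2": 0, "b": 0, "c": 0, "c2": 0}
sigma = {"a": "A", "a2": "A", "b": "B", "c": "C", "c2": "C"}

edt, ldt, pdt = divergence_graphs(
    gene_parent, gene_age, species_parent, species_age, sigma
)
```

In this toy input the ten gene pairs split into four EDT edges, three LDT edges (all involving `a2`, the gene that diverged from `b` after the species split) and three PDT edges (among them `c` and `c2`, which share a species and are separated by the duplication). Every pair lands in exactly one set, which is the edge partition of the complete graph mentioned in the abstract.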