论文标题
dollo-$ k $字符的系统发育概念的组合观点
Combinatorial perspectives on Dollo-$k$ characters in phylogenetics
论文作者
论文摘要
最近,具有持久字符的完美系统发育模型在文献中引起了极大的关注。这是基于这样的假设,即复杂的特征或角色只能获得一次,而在进化过程中损失了一次。在这里,我们考虑了该模型的概括,即Dollo Parsimony,它允许多个角色损失。更确切地说,我们对Dollo-$ k $字符的概念进行组合观点,即最多获得的特征,并且在整个演变过程中完全失去了$ k $ times。我们首先根据跨越子树的概念引入算法,该算法是为给定角色找到一个dollo-$ k $标签的算法,并在线性时间内为给定的树。然后,我们比较持久字符(由Dollo-0和Dollo-1字符的结合组成)和一般Dollo- $ K $字符。虽然众所周知,惠誉(Fitch)的简约和持久角色之间存在密切的联系,但我们表明,Dollo Parsimony和Fitch Parsimony通常截然不同。此外,虽然众所周知,持久字符的数量与树木平衡索引的sackin索引之间存在直接关系,但我们表明这种关系并未概括为dollo-$ k $字符。实际上,确定给定树的Dollo-$ k $字符的数量比计算持续的字符更重要,我们通过为前者引入递归方法来结束手稿。这种方法会导致多项式时间算法来计算Dollo-$ k $字符的数量,并且该算法以及用于计算Dollo-$ k $标签的算法都可以在Babel Package中用于Beast 2。
Recently, the perfect phylogeny model with persistent characters has attracted great attention in the literature. It is based on the assumption that complex traits or characters can only be gained once and lost once in the course of evolution. Here, we consider a generalization of this model, namely Dollo parsimony, that allows for multiple character losses. More precisely, we take a combinatorial perspective on the notion of Dollo-$k$ characters, i.e. traits that are gained at most once and lost precisely $k$ times throughout evolution. We first introduce an algorithm based on the notion of spanning subtrees for finding a Dollo-$k$ labeling for a given character and a given tree in linear time. We then compare persistent characters (consisting of the union of Dollo-0 and Dollo-1 characters) and general Dollo-$k$ characters. While it is known that there is a strong connection between Fitch parsimony and persistent characters, we show that Dollo parsimony and Fitch parsimony are in general very different. Moreover, while it is known that there is a direct relationship between the number of persistent characters and the Sackin index of a tree, a popular index of tree balance, we show that this relationship does not generalize to Dollo-$k$ characters. In fact, determining the number of Dollo-$k$ characters for a given tree is much more involved than counting persistent characters, and we end this manuscript by introducing a recursive approach for the former. This approach leads to a polynomial time algorithm for counting the number of Dollo-$k$ characters, and both this algorithm as well as the algorithm for computing Dollo-$k$ labelings are publicly available in the Babel package for BEAST 2.