Research  Open  Published:
Recursive algorithms for phylogenetic tree counting
Algorithms for Molecular Biologyvolume 8, Article number: 26 (2013)
Abstract
Background
In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult.
Results
We describe an algorithm that is polynomial in the number of sampled individuals for counting of resolutions of a constraint tree assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals.
Conclusions
These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
Background
A phylogenetic tree is the common object of interest in many areas of biological science. The tree represents the ancestral relationships between a group of individuals. Given molecular sequence data sampled from a group of organisms it is possible to infer the historical relationships between these organisms using a statistical model of molecular evolution. At present, Bayesian Markov chain Monte Carlo (MCMC) methods are the dominant inferential tool for inferring molecular phylogenies [1].
It is a recent trend to include fossil evidence into the inference to obtain absolute estimates of divergence times [2, 3]. Fossils may restrict the age of the most resent common ancestor of a subgroup of individuals. This imposes a constraint on the tree topology (the discrete component of a genealogy) and therefore reduces the space of allowable genealogies.
Another trend in phylogenetic analyses is serial (or heterochronous) sampling in which molecular data is obtained from significantly different time points and analysed together. This type of data arises most frequently with ancient DNA and rapidly evolving pathogens [4]‐[6]. In this case tip dates become a part of the genealogy.
Including serially sampled or fossil data modifies or restricts the shape of a phylogenetic tree. Little has been done to describe and classify these modified trees. In this paper, we aim to explore the new spaces formed by these trees.
A genealogy consists of discrete and continuous components — the tree topology and the divergence times. The tree topologies form a finite tree space when the number of tips is bounded. An important characteristic of this space is the number of trees in it and we aim to find an efficient way to calculate this number.
In the case that fossil data restricts the tree topology, counting the number of trees that satisfy the imposed constraints reveals how much the constraints reduce the tree space.
The number of trees arises as a constant in tree prior distributions. Typically we model the distribution of tree topologies as independent of the distribution of divergence times. The density function of the distribution of genealogies is then a product of the density function for the divergence times and the distribution function for tree topologies. A common prior on tree topologies is uniform over all allowable topologies so the distribution function is a constant that is equal to one over the number of tree topologies. When inferring tree topologies using Bayesian MCMC methods, we do not usually need to know this constant but in some cases, as described below, the absolute value of the prior distribution is of interest and the constant has to be calculated.
When fossils are used to restrict the age of internal nodes, the tree prior should accurately account for this fact. Heled and Drummond [3] introduced a natural approach for tree prior specification when fossil evidence is employed in the inference. Their method requires counting of ranked phylogenetic trees that obey a number of constraints that arise from including the fossil evidence. The construction requires calculation of the marginal density for the time of the calibration node, the node representing the most recent common ancestor of a clade which may or may not be monophyletic. For a particular location of the calibration node, or particular constraints on the tree topology, the marginal density function is the marginal density function for the divergence times weighted by the number of trees satisfying the constraints. In this case, the weight constants do not cancel in the MCMC scheme and therefore have to be calculated.
Tree counting has a long history. For phylogenetic trees, the counting problem is to find the number of all possible trees on n leaves. For some types of phylogenetic tree, there are known closed form solutions to this problem. For other types, only recursive equations have been derived. In this paper, we consider only rooted trees.
A survey of results on counting different types of rooted trees is presented in [7] where trees with different combinations of the following properties are considered: trees are either labeled (only leaves are labeled) or unlabeled, ranked or non‐ranked, and bifurcating or multifurcating. The results presented in the survey can also be found in [6, 8, 9].
In [10], Griffiths considered unlabeled, non‐ranked rooted trees such that interior nodes can have one child or more and the root has at least two children. Using generating functions, he derived recursive equations for counting the number of all possible such trees on n leaves with s interior nodes. In [11], Felsenstein considered partially labeled trees, i.e., a tree in which all the leaves are labeled and some interior nodes also may be labeled. He derived the recursive equations for counting the number of rooted, non‐ranked, partially labeled trees with n labeled nodes.
In this paper, we consider a number of counting problems for different classes of phylogenetic trees. First, we describe an effective way of counting the number of all possible fully ranked trees on n leaves, that is, trees on n leaves in which all internal and leaf nodes are ranked.
Second, we find the number of bifurcating trees that resolve a given multifurcating tree with n leaves. We give a solution to this problem for rooted, ranked, labeled trees and generalise the algorithm to count resolutions to fully ranked trees.
Finally, we introduce and formally describe a new type of phylogenetic tree and describe an algorithm for counting the number of all such trees on n leaves. This type of tree is important when we have a serial sample and sampled individuals can be direct ancestors of later sampled individuals. When the population size is small or the fraction of individuals sampled from the population is large, this type of tree should be included in the inference [12, 13].
Serial sampling
We mainly follow the terminology from [9] for the definitions of phylogenetic trees. A tree is a finite connected undirected graph with no cycles. A rooted tree is a tree with a single node ρ designated as a root. Every rooted tree T=(V,E,ρ) imposes a partial order on V that is defined as follows: v_{1}≤_{ T }v_{2} if a unique simple path from the root to v_{2} passes through v_{1}. So the root is the smallest element. If v_{1}≤_{ T }v_{2} then we say that v_{1} is an ancestor of v_{2} and v_{2} is a descendant of v_{1}. A node in a rooted tree is called interior if it has descendants and a leaf if it has no descendants. The root is considered interior. Denote $\stackrel{\circ}{V}$ the set of interior nodes of T. A node u is a parent of a node v and v is a child of u if v<_{ T }u and there is no w∈V such that v<_{ T }w<_{ T }u. A rooted tree is called binary if every interior node has exactly two children. It is called weakly binary if every interior node has at most two children. We have chosen this terminology to fit with the usage of “binary” in the phylogenetics literature which may not agree with that in other literatures.
Let X be a finite non‐empty set of labels. A phylogenetic X‐tree is a pair $\mathcal{T}=(T,\varphi )$, where T is a tree and ϕ is a bijection from X onto the set of leaves of T (we may omit X and say “tree” instead of “X‐tree” if the set of labels is not specified). The tree T is called an underlying tree or a shape of the phylogenetic tree and ϕ is a labeling function. If the underlying tree of is rooted then is called a rooted phylogenetic tree. In what follows, we consider only rooted trees unless explicitly stated otherwise. A phylogenetic tree is binary (weak binary) if its underlying tree is binary (weak binary). A ranked phylogenetic tree is a pair $(\mathcal{T},h)$, where is a rooted phylogenetic tree and h is an injective function (ranking function) from the set $\stackrel{\circ}{V}$ into the set $\{1,\dots ,\stackrel{\circ}{V}\left\right\}$ such that v_{1}≤_{ T }v_{2} implies h(v_{1})≤h(v_{2}) for every ${v}_{1},{v}_{2}\in \stackrel{\circ}{V}$. In other words, there is a linear order on the interior nodes of T that is consistent with the partial order of T.
Definition 1
A ranked X‐tree is a binary ranked phylogenetic X‐tree.
An example of a ranked tree is given in Figure 1.
In biology, a phylogenetic tree represents the evolutionary history of a collection of sampled individuals. The collection of individuals is represented by the set X. The root of the tree is the most recent common ancestor of X and interior nodes are bifurcation events. The ranking function represents the time order of the bifurcation events. A general problem in evolutionary biology is how to reconstruct the phylogenetic tree from sequence data obtained from sampled individuals. Tackling this problem in a Bayesian framework may require counting the number of all possible histories on a sample of individuals.
When all individuals are sampled at the same time (as in Figure 1) counting tree problem has a simple solution.
Let X be a fixed label set such that X=n. The number of all ranked X‐trees up to isomorphism is
This formula has been derived by many authors. Proofs can be found in [6, 7], or [9]. The letter R in the equation comes from the word “ranked”.
The situation is different when individuals are sampled at different times (serially sampled). In this case, we need to define another kind of phylogenetic tree in which leaves are also ranked.
Definition 2
A fully ranked (FR) X‐tree is a pair$(\mathcal{T},h)$, whereis a binary rooted phylogenetic X‐tree and h:V→{1,…,l} with$\left\stackrel{\circ}{V}\right<l\le \leftV\right$is a surjective function such that

v_{1}≤_{ T }v_{2} implies h(v_{1})≤h(v_{2}) and

h(v_{1})=h(v_{2}) implies v_{1}=v_{2} or ${v}_{1},{v}_{2}\in V\setminus \stackrel{\circ}{V}$.
An example of a fully ranked X‐tree is given in Figure 2.
Before the tree is reconstructed we observe only leaves (sampled individuals) of the tree that are grouped (pre‐ranked) according to the times they were sampled. For the tree shown in Figure 2, we have two sampling times and hence two groups: A, B, and C form the first group, D and E form the second group.
Let $\mathfrak{T}=(\mathcal{T},h)$ be a fully ranked X‐tree with h:V→{1,…,l}. Let m=h(ϕ(X)), that is, the number of sampling times. Define a pre‐ranking function$\phantom{\rule{0.5em}{0ex}}\u0125$ from X onto {1,…,m} for tree $\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$ such that for all x_{1},x_{2}∈X

h(ϕ(x_{1}))≤h(ϕ(x_{2})) implies $\u0125({x}_{1})\le \u0125({x}_{2})$ and

h(ϕ(x_{1}))=h(ϕ(x_{2})) iff $\u0125({x}_{1})=\u0125({x}_{2})$.
For the tree given in Figure 1, $\u0125(A)=\u0125(B)=\u0125(C)=1$ and $\u0125(D)=\u0125(E)=2$.
Let X and $\u0125:X\to \{1,\dots ,m\}$ be fixed. We are interested in the number of all fully ranked X‐trees that have$\phantom{\rule{0.5em}{0ex}}\u0125$ as a pre‐ranking function. Note that this number depends only on the numbers ${n}_{i}=\left\right\{x\u0125(x)=i\}$, the number of individuals sampled at the ith time point, not on X and $\phantom{\rule{0.5em}{0ex}}\u0125$ directly. We denote this quantity by F(n_{1},…,n_{ m }), where F stands for “fully ranked”. Then
and F(n)=R(n).
Proof
Consider a continuous process of bifurcation in which lineages may bifurcate in time or be cut and labeled (sampled). The process finishes when all lineages are cut producing a tree. The discrete structure of the tree produced by this process is a fully ranked X‐tree. It is easy to see that every fully ranked X‐tree can be obtained as a result of this process. To count the required number we can count the number of different trees which can be produced by the process if we know that after it finishes there are n_{ i } sampled individuals (i.e., cut and labeled lineages) at the ith time point, i.e., we have the sequence (n_{1},…,n_{ m }).
Suppose that at the (m−1)th time point there are i lineages that are ancestral to n_{ m } individuals sampled at time m. When we look at this process backwards in time the bifurcation events become coalescence events. The number of different ways these n_{ m } lineages coalesce to i lineages is $\frac{R({n}_{m})}{R(i)}$. This is the number of all possible ranked X‐trees on n_{ m } individuals but since we are not interested in the structure of the coalescent after we reach i lineages, it is divided by the number of ways in which the remaining i lineages can coalesce. Note that if coalescence patterns are different between the (m−1)th and mth time points then the trees are also different.
Further, for each of these coalescence patterns, we need to count the number of different ways these i lineages and other n_{1},…,n_{m−1} lineages can coalesce. This is where we can apply the recursion. We can consider that we also cut these i lineages at time m−1 and label them with the ranked subtrees descendant from these lineages. Then, at time m−1, we have n_{m−1} sampled individuals and another i sampled individuals and it remains to count the number of trees on the sequence (n_{1},…,n_{m−1}+i).
Note that two trees are different if they have different numbers of lineages at time m−1. The number of additional i lineages can be between 1 and n_{ m } and we need to sum over all possible i to complete the recursion. □
We introduce a third type of tree in which sampled individuals may be direct ancestors of later sampled individuals. We call it a tree with sampled ancestors. This type of tree is not usually considered in phylogenetics since the probability of sampling a direct ancestor is often negligible. In small populations or when a large portion of the population is sampled, however, this can not be ignored.
Let T=(V,E,ρ) be a weak binary tree. Define a set $\widehat{V}$ as follows:
A rooted S‐phylogenetic X‐tree is a pair $\mathcal{T}=(T,\varphi )$, where T is a weak binary tree and $\varphi :X\to \widehat{V}$ is a bijection.
Definition 3
A fully ranked X‐tree with sampled ancestors (FRS X‐tree) is a pair $(\mathcal{T},h)$, where is a rooted S‐phylogenetic X‐tree and h:V→{1,…,l} is a surjective function such that

v_{1}<_{ T }v_{2} implies h(v_{1})<h(v_{2}) and

h(v_{1})=h(v_{2}) implies v_{1}=v_{2} or ${v}_{1},{v}_{2}\in \widehat{V}$,
(see Figure 3).
The definition of a pre‐ranking function remains the same for FRS trees. Let S(n_{1},…,n_{ m }) (with S standing for ‘sampled ancestors’) denote the number of all FRS X‐trees that have the same pre‐ranking function $\phantom{\rule{0.5em}{0ex}}\u0125$, where ${n}_{i}=\left\right\{x\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}\u0125(x)=i\}$. Then
and S(n)=R(n).
Proof
Consider the same process as before with only change of sampling events. Now, at some points of time, some lineages are cut and labeled and others are only labeled but not cut.
Then this equation can be obtained as follows. We have n_{ m } individuals that are sampled at the mth time point. At time m−1, there are between 1 and n_{ m } ancestral lineages of those n_{ m } individuals, depending on the number of coalescence between times m and m−1. Let i denote the number of these lineages. Then there are $\frac{R({n}_{m})}{R(i)}$ different possible coalescent patterns that can lead to this situation. Some of these i ancestral lineages may be among the individuals sampled at time m−1, i.e. lineages that are labeled but not cut at time m−1. Let j be the number of those ancestral lineages that are sampled at time m−1. There are $\left(\genfrac{}{}{0.0pt}{}{i}{j}\right)$ ways to chose these j lineages out of i and there are $\left(\genfrac{}{}{0.0pt}{}{{n}_{m1}}{j}\right)$ possible ways to chose j sampled at time m−1 individuals that are not cut at time m−1.
Further, at time m−1, there are n_{m−1} sampled lineages and i−j ancestral lineages that are not sampled and it remains to count the number of FRS trees on the sequence (n_{1},…,n_{m−1}+i−j).
Finally, we sum over all possible i and j to complete the recursion. □
Dynamic counting
Calculating the recursions (1) and (2) directly is inefficient and impractical. Here we describe a more efficient algorithm for counting fully ranked trees using these recursions. Rewrite equation (1) as
Then instead of calculating F(n_{1},…,n_{ j }+α) for j∈{1,…,m−1} and α∈{0,…,n_{j+1}+…+n_{ m }} we can calculate
using recurrence equations:
This leads to Algorithm 1 to calculate F(n_{1},…,n_{ m }). Let n be the number of sampled individuals, i.e., $n=\sum _{i=1}^{m}{n}_{i}$. Calculation of all the R(i) takes O(n) steps and calculation of all the A^{j} takes at most O(n) steps. In total, the algorithm takes O(m n) steps.
Algorithm 1 Calculating the number of fully ranked trees
A similar approach leads to an O(m n^{2})‐time algorithm for counting FRS trees. The description of this algorithm is in Appendix Appendix 1: Algorithm for counting FRS trees.
Constraints
In phylogenetic analysis, it is common to have some limited information about the ancestors of sampled individuals. We consider two types of such information. First, we may know that a subgroup of sampled individuals is monophyletic. That means that the most recent common ancestor of the subgroup is not an ancestor of any other individual that does not belong to the subgroup. Second, we may know the relative ages of the most recent common ancestors of monophyletic subgroups. This known information imposes constraints on the space of possible phylogenetic trees representing the evolutionary history of sampled individuals. The question is how many phylogenetic trees satisfy the constraints on a group of sampled individuals?
The number of resolutions of a constraint tree
We first describe a problem for contemporaneous sampling in terms of constraint trees. We call a rooted tree multifurcated if each interior node has at least 2 children. Note that in contrast to the common terminology we assume that a binary tree is also multifurcated. If we replace the word “binar” with “multifurcated” in the definition of a ranked tree we obtain a more general class of trees.
Definition 4
A constraint X‐tree is a multifurcated ranked phylogenetic X‐tree.
An example of a constraint tree is given in Figure 4. A constraint tree represents prior information about clades and ranking. Each interior node constrains a subgroup of individuals, leaves that are descendant from this node, to by monophyletic. The most recent common ancestor of the whole group of individuals, the root node, is also regarded as a constraint. The ranking function constrains the ages of the most recent common ancestors of the monophyletic subgroups to have a specified order.
We say that a ranked X‐tree ${\mathfrak{T}}_{1}=({\mathcal{T}}_{1},{h}_{1})$resolves a constraint X‐tree ${\mathfrak{T}}_{2}=({\mathcal{T}}_{2},{h}_{2})$ if there is an isomorphic embedding of ${\mathfrak{T}}_{2}$ into ${\mathfrak{T}}_{1}$, i.e., there is an injective mapping f:V_{ 2 }→V_{ 1 }such that

ϕ_{ 1 }(x)=f(ϕ_{ 2 }(x)) for each x∈X;

$$v{\le}_{{T}_{2}}u$$
iff $f(v){\le}_{{T}_{1}}f(u)$ for each u,v∈V_{ 2 }, and

h_{ 2 }(v)≤h_{ 2 }(u) implies h_{ 1 }(f(v))≤h_{ 1 }(f(u)) for each $u,v\in \underset{2}{\overset{\circ}{V}}$.
We wish to calculate the number of ranked trees that resolve a given constraint tree $\mathfrak{T}=(T,\varphi ,h)$. It is easy to see that this number depends only on the underlying tree T and ranking function h, but does not depend on the labeling function ϕ or the label set X.
We now introduce some notation in order to define recursive equations. We label interior nodes according to their ranks such that node i is a node v such that h(v)=i. A subtree induced by node i and its children is called subtree i. Child nodes in a subtree may be leaves in the initial tree. Let n_{ i }≥0 denote the number of such child nodes in subtree i. Let f:{2,…,k}→{1,…k−1} be the parent function on interior nodes, i.e., f(i)=j whenever j is a parent to i. When k=1, i.e., there is only one interior node, f=∅. See Figure 4 for an example of introduced notation. Note that a tuple (n_{ 1 },…,n_{ k },f) completely defines a pair (T,h).
Let R^{r}(n_{ 1 },…,n_{ k },f) be the number of ranked trees resolving a constraint tree defined by the tuple (n_{ 1 },…,n_{ k },f). The superscript r stands for “resolution”. Then the following equations hold.
where C is the collection of nodes that have more than 2 children and at least 2 of them are leaves, i.e., C={i<k  n_{ i }≥2 and n_{ i }+α_{ i }>2} and α_{ i }={a f(a)=i}. Note that n_{ i }+α_{ i } is the number of children of node i.
Proof
When a constraint tree has 2 leaves, it is unique and is resolution of itself. So Equation (6) is trivial. To explain the main sum in Equation (7) and (8) we consider the constraint tree which is defined by (3,3,{(2,1)}) and shown in Figure 5, left. The last interior node of a resolving tree (that is, the interior node with the highest rank or the furthest node from the root) is either a parent to leaves in subtree 1 or leaves in subtree 2. Suppose it is the first case (see Figure 5, centre). Since leaves have distinct labels from X, there are $\left(\genfrac{}{}{0.0pt}{}{3}{2}\right)$ ways to chose two leaves that are children of that last node. We can partition all the resolving trees for which the last node is in subtree 1 in $\left(\genfrac{}{}{0.0pt}{}{3}{2}\right)$ groups. The number of trees in each group is the number of trees that resolve a constraint tree defined by (2,3,{(2,1)}) and shown on the right of Figure 5. A similar argument holds if the last node in a resolving tree is a parent to nodes from subtree 2.
So in the general case, there are k subtrees and if the last node is in subtree i we have $\left(\genfrac{}{}{0.0pt}{}{{n}_{i}}{2}\right)$ ways to choose two lineages that coalesce and then we should count the number of trees resolving the tree defined by (n_{ 1 },…,n_{ i }−1,…,n_{ k },f). Note that in the example above, we consider the tree with more than 2 leaves in each subtree. However, the last interior node of a resolving tree can not be in subtree i for i<k if there is not enough leaves in this subtree. This can happen either if there are less than 2 leaves in subtree i or if there are 2 leaves in subtree i and node i has only these two leaves as its children. Both cases imply that any parent to leaves of subtree i in a resolving tree has a lesser rank than the rank of node k. This explains why we sum only over the elements of the set C.
Finally, we should consider one more case which explains why there is one more summand in equation (7). If the last node in a constraint tree has only two children, i.e., there are 2 leaves in subtree k, then there is one more group of resolving trees, the group that consists of resolving trees that have this node as the last node. □
Dynamic counting
We will calculate R^{r}(n_{ 1 },…,n_{ k },f) for the corresponding constraint tree. In order to find R^{r}(n_{ 1 },…,n_{ k },f), at each step s, we will calculate numbers ${R}^{r}({x}_{1},\dots ,{x}_{t},f{}_{{\mathbf{t}}_{1}})$ with $\sum _{i\le t}{x}_{i}=s$, t_{ −1 }={2,…,t}, and t≤k. Note that we do not have to calculate all such numbers. To determine which numbers are required we define two upper triangular matrices m and M of size k×k.
Suppose we draw a horizontal line which is strictly below the line that passes through node j and strictly above the line that passes through node j+1 (or all the leaves if j=k). Then m_{ i,j } is the minimal possible number of intersections of this line with branches of subtree i in a resolving tree and M_{ i,j } is the maximal possible number. An example is given in Figure 6.
Let a_{ i,j }={x≤j ∣ f(x)=i} for i≤j. So a_{ i,j } is the number of children of node i with ranks at most j. Then
Let t≤k and ${x}_{1},\dots ,{x}_{t}\in \mathbb{N}$. We call a tuple $({x}_{1},\dots ,{x}_{t},f{}_{{\mathbf{t}}_{1}})$ eligible if m_{ i,t }≤x_{ i }≤M_{ i,t } for 1≤i≤t.
We now turn to Algorithm 2 to count resolutions. At each step s≤n, we construct a set S_{ s }. A unique element of S_{ n } is R^{r}(n_{ 1 },…,n_{ k },f) and calculating elements of S_{ s } only requires elements of S_{ s−1 }.
Algorithm 2 Calculating the number of resolutions of a constraint tree
Proposition 1
When k is fixed Algorithm 2 does at most O(n^{k}) steps.
Proof
The algorithm does O(k) steps for each eligible tuple and, since we assume k is a constant, it is O(1). For given j there are
eligible tuples of size j. Since M_{ i,j }−m_{ i,j }+1≤n_{ i }+α_{ i }−1, for the total number of eligible tuples we have
□
The number of resolutions of a fully ranked constraint tree
We can generalise the results of the previous section to fully ranked trees. Now we replace the word “binary” with “multifurcated” in the definition of a fully ranked tree to get
Definition 5
A fully ranked constraint X‐tree is a pair $(\mathcal{T},h)$ , where is a multifurcated rooted phylogenetic X‐tree and h is a function such that h:V→{1,…,l}, where $\left\stackrel{\circ}{V}\right<l\le \leftV\right$ , and

v _{ 1 } ≤ _{ T } v _{ 2 } implies h(v _{ 1 } )≤h(v _{ 2 } ),

h(v_{ 1 })=h(v_{ 2 }) implies v_{ 1 }=v_{ 2 }or ${v}_{1},{v}_{2}\in V\setminus \stackrel{\circ}{V}$.
We say that a fully ranked X‐tree ${\mathfrak{T}}_{1}=({\mathcal{T}}_{1},{h}_{1})$resolves a fully ranked constraint X‐tree ${\mathfrak{T}}_{2}=({\mathcal{T}}_{2},{h}_{2})$ if there is an isomorphic embedding of ${\mathfrak{T}}_{2}$ into ${\mathfrak{T}}_{1}$, i.e., there is an injective mapping f:V_{ 2 }→V_{ 1 }such that

ϕ_{ 1 }(x)=f(ϕ_{ 2 }(x)) for each x∈X;

$$v{\le}_{{T}_{2}}u$$
iff $f(v){\le}_{{T}_{1}}f(u)$ for each u,v∈V_{ 2 },

h_{ 2 }(v)≤h_{ 2 }(u) implies h_{ 1 }(f(v))≤h_{ 1 }(f(u)) for each u,v∈V_{ 2 }, and

h_{ 2 }(v)=h_{ 2 }(u) iff h_{ 1 }(f(v))=h_{ 1 }(f(u)) for each u,v∈V_{ 2 }.
The problem is to count the number of fully ranked resolutions of a fully ranked constraint tree.
Let $\mathfrak{T}=(\mathcal{T},h)$ be a fully ranked constraint tree with h:V→{1,…,l} and$\left\stackrel{\circ}{V}\right=k$. Again we label interior nodes with numbers from {1,…,k}. However, labels and ranks of interior nodes do not necessary coincide now. Let $r:\{1,\dots ,k\}\to h(\stackrel{\circ}{V})$ be an injective increasing function which maps the kth interior node to its rank. Note that r(1) is always equal to 1 because the root node always has rank 1 and that r(k)<l because only leaves can have rank l. Now, node i is the node that has rank r(i) and subtree i is the subtree which is induced by node i and its children. Since leaves in subtrees are now ranked, we need more parameters to encode the tree. Let n= [ n_{i,j}] be a matrix of sizek×l, wheren_{i,j} is the number of leaves of rankj in subtreei or the number of children of nodei with rankj. Note that if nodei is a parent of nodej thenn_{i,r(j)}=1 andn_{i,x}=0 forx ≠ r(j) and this means that the parent function is uniquely defined by the matrix n and functionr and the number of resolutions of$\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$ depends only on (n,r).
If a constraint tree has only 2 leaves then there is only 1 tree resolving it:
withr_{ 0 } mapping 1 to 1.
For the main recursion we need to determine the location of the last interior node in a resolving tree. As before, this node can be a parent to leaves in different subtrees. Also, since leaves now may have different ranks, the last interior node in a resolving tree may be ranked in different ways with respect to ranks of the leaves. LetN_{c,x} denote the number of children of nodec with ranks at leastx, that is, ${N}_{c,x}=\sum _{j=x}^{l}{n}_{c,j}$. For eachp∈{r(k)+1,…,l}, we define a setC^{p} of candidate subtrees in which the last node can be placed at levelp, i.e. between the (p−1)th andpth time points.
See Figure 7 for an example of subtree where the last node can be placed between the (p−1)th andpth time points.
For eachp andc such thatc∈C^{p}, there is a distinct group of resolving trees. The quantity of trees in this group is equal to the number of resolutions of a constraint tree which is defined by matrix ${\mathbf{n}}^{c,p}=\phantom{\rule{0.3em}{0ex}}\left[\phantom{\rule{0.3em}{0ex}}{n}_{i,j}^{c,p}\right]$ of sizek×p such that

$${n}_{i,j}^{c,p}={n}_{i,j}$$
for 1≤i≤k andj<p,

$${n}_{c,p}^{c,p}={N}_{c,p}1$$
,

$${n}_{i,p}^{c,p}={N}_{i,p}$$
fori≤k andi≠c.
If nodek has only 2 children then, as before, there is one more group of resolving trees. This group consists of the resolving trees in which the last interior node coincides with nodek. Let${\mathbf{n}}^{\prime}=\phantom{\rule{0.3em}{0ex}}\left[\phantom{\rule{0.3em}{0ex}}{n}_{i,j}^{\prime}\right]$ be a matrix of size (k−1)×r(k) such that

$${n}_{i,j}^{\prime}={n}_{i,j}$$
for 1≤i≤k andj<r(k),

$${n}_{i,r(k)}^{\prime}={N}_{i,r(k)}$$
for 1≤i≤k−1.
This matrix defines a constraint tree for which the number of resolutions is equal to the number of trees in the last group.
Then the main recursion is as follows:
Using these equations, we can calculateF^{r}(n,r) forO(m^{2}n^{k}) steps, wherem is the number of sampling points, i.e.,m=l−k. The calculation is described in Appendix Appendix 2: Algorithm for counting fully ranked resolutions of a fully ranked constraint tree.
Conclusions
In Table 1, we summarised the complexity (upper bounds) of the counting algorithms. We can see that counting fully ranked and FRS trees is reasonably fast, particularly when the number of sampling points is small. Counting of resolutions of a constraint tree is an expensive procedure but for smallk it remains possible. In practice,k is typically small so this algorithm will be of practical use. Counting resolutions of an FRS constraint tree, using the same methods, has a greater cost and is less elegant, but its practicality and bounds on its complexity remain to be assessed.
These algorithms can be implemented in software for phylogenetic analysis that involves serial sampling scheme or limited prior knowledge about ancestors of particular clades for calculating tree prior distributions.
Appendix 1: Algorithm for counting FRS trees
Rewrite equation (2) as follows:
where $A(i,a,b)=\sum _{x=0}^{\mathit{\text{min}}\{a,bi\}}\frac{R(b)}{R(i+x)}\left(\genfrac{}{}{0.0pt}{}{i+x}{x}\right)\left(\genfrac{}{}{0.0pt}{}{a}{x}\right)$. Then we can calculateS(n_{ 1 },…, n_{ m }) recursively using Algorithm 3. We use the notation ${N}_{j}=\sum _{i=j}^{m}{n}_{i}$.
Algorithm 3 Calculating the number of FRS trees
At each stepj>1 of the algorithm, we calculateS(n_{ 1 },…, n_{ j }+α) for 0≤α≤N_{j+1}. We skip stepj=1 and do not calculateS ( n_{ 1 }+ α) for 0≤α ≤ N_{ 2 } becauseS ( x )= R ( x) and we have already calculated all the necessaryR(x). Further, for calculating eachS(n_{ 1 },…, n_{ j }+ α), we need to calculate the coefficientsA and this is the most expensive part of the algorithm. So, at each stepj>1, we need to calculateA ( i , n_{j−1},n_{ j }+ α) for 0≤i ≤ n_{ j }+ α and 0≤α ≤ N_{j+1}.
First, we calculate these values forα=0. Denote
then
where 0≤i ≤ b,a=n_{j−1}, andb = n_{ j }. Wheni is fixed, calculation ofA ( i,n_{j−1},n_{ j }) requires values ofB(i,n_{j−1},n_{ j },x) for 0≤x ≤ m i n { n_{j−1},n_{ j }−i}. Rewrite this asB ( n_{ j }−β,n_{j−1},n_{ j },x) for 0≤x ≤ m i n{n_{j−1},β} withβ=n_{ j }−i. Now we can calculate these values for 0≤β≤n_{ j } and, hence, for 0≤i ≤ n_{ j } using recursive equations:
We use equation (11) for 0≤x≤m i n { n_{j−1},β}. Ifm i n{n_{j−1},β+1}≠m i n { n_{j−1},β} and thereforem i n { n_{j−1},β+1}=m i n { n_{j−1},β}+1=β+1 then we also use equation (12). The cost of this calculation is dominated by the number of the summandsB ( n_{ j }− β,n_{j−1},n_{ j },x) for 0≤x ≤ m i n { n_{j−1},β} and 0≤β ≤ n_{ j }. This number is $O({n}_{j}^{2})$.
Now we need to calculateA(0,n_{j−1},n_{ j }+α),…,A(n_{ j }+ α,n_{j−1},n_{ j }+ α) for 1≤α ≤ N_{j+1}. HavingA(0,n_{j−1},n_{ j }+α),…,A(n_{ j }+α,n_{j−1},n_{ j }+α) calculated (note that we have already calculated these values forα=0), we can calculateA(0,n_{j−1},n_{ j }+(α+1)),…,A(n_{ j }+(α+1),n_{j−1},n_{ j }+(α+1)) using equations
where 0≤i ≤ b,a = n_{j−1}, andb=n_{ j }+α.
Moreover, for eachα, we can optimize calculation of the second summands in the first case of equation (13) using recursion
To apply this recursion we need an initial value. This value depends onn_{j−1},n_{ j }, andα and, since only the first case of equation (13) contains the second summand, it is not necessary the value of this summand fori=0. There are 3 cases for calculation of all the initial values that are necessary at stepj.
Case 1: N_{ j }< n_{j−1}. That means that we use the first case of equation (13) for allα. In this case, for allα, the initial value for recursion (14) is the value of the second summand in (13) wheni=0. So we need to calculate $\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+\alpha}{0}\right)\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+\alpha +0}\right)$ for 1≤α ≤ N_{j+1} and those are $\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+1}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+{N}_{j+1}}\right)$. This takesO(N_{ j }) steps.
Case 2: n_{ j }+α_{ 0 }= n_{j−1} for someα_{ 0 }∈{0,…, N_{j+1}}. Ifα_{ 0 }>0 then, for allα ≤ α_{ 0 }, we use the first case of (13) and therefore we need to calculate $\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+1}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+{\alpha}_{0}}\right)$, this takesO ( α_{ 0 }) steps. For allα > α_{ 0 }, we use the second case first and, starting fromi = α − α_{ 0 }, we use only the first case. So we need to calculate $\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+\alpha}{\alpha {\alpha}_{0}}\right)\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+\alpha (\alpha {\alpha}_{0})}\right)$ forα∈{α_{ 0 }+1,…,N_{j+1}}, which are$\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}+1}{1}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}+({N}_{j+1}{\alpha}_{0})}{({N}_{j+1}{\alpha}_{0})}\right)$. This takesO(N_{j+1}−α_{ 0 }) steps. In total, we haveO ( N_{j+1}) steps.
Case 3: n_{j−1}<n_{ j }. That means that, for allα, we use the second case first and, starting fromi = n_{ j }+ α − n_{j−1}, we use only the first case. Calculate$\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+1}{{n}_{j1}}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+{N}_{j+1}}{{n}_{j1}}\right)$ forO ( n_{j−1}+N_{j+1}) steps.
Each case costs at mostO ( N_{ j }). Provided that the initial values are calculated, the cost of the rest calculation at stepj is dominated by the number of the coefficientsA ( i,n_{j−1},n_{ j }+ α) for 0≤i ≤ n_{ j }+ α and 1≤α ≤ N_{j+1} and it is$O({N}_{j}^{2})$. Summing up$O({N}_{j}^{2})$X,O ( N_{j−1}), and $O({N}_{j}^{2})$ gives us the cost of each stepj, which is $O({N}_{j}^{2})$. Since 1<j ≤ m and calculation ofR ( i) for 0≤i ≤ N_{ 1 }= n takesO ( n), the algorithm doesO ( m n^{2}) steps in total.
Appendix 2: Algorithm for counting fully ranked resolutions of a fully ranked constraint tree
The algorithm for counting resolutions of a constraint tree requires a few changes to count resolutions of a fully ranked constraint tree. Recall that, at each steps, we calculated the setS_{ s } which consists of the numbers of resolutions of intermediate trees withs leaves. To constructS_{ s }, for each element ofS_{s−1}, we proposed a collection of tuples that may define intermediate trees withs leaves. To accept eligible tuples we defined two matrices m and M. The general scheme for counting resolutions of a fully ranked constraint tree is the same. However, we need to define matrices m and M for a fully ranked constraint tree and describe the procedure of proposing new tuples. The procedure becomes more technical because of additional ranking.
We will calculateS^{r}(n,r) for the corresponding fully ranked constraint tree $\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$. Let
for 1≤i ≤ kandr ( i )≤ j ≤ l−1, i.e.,a_{i,j} is the number of interior nodes that are children of nodei and have rank at mostj. Define two matrices m and M of sizek ×( l−1) such that
As before, if we consider a horizontal line that is strictly between the horizontal line that passes through the nodes of rankj and the horizontal line that passes through the nodes of rank (j+1) then these matrices determine the minimal and maximal possible numbers of intersections of this line with branches of subtreei.
Let x= [x_{i,j}] be a matrix of sizet × q, wheret ≤ k andq ≤ l, and t={1,…,t}. A tuple (x,r _{t}) is eligible if

x_{ i , j }x=n_{i,j} for 1≤i ≤ t and 1≤j < q, and

$${\mathtt{m}}_{i,q1}\le {x}_{q}^{i}\le {\mathtt{M}}_{i,q1}$$
for 1≤i ≤ t.
Having an intermediate tree withs−1 leaves, we need to consider all the possible ways to transform this tree to a tree withs leaves. First we need to add a new leaf to some subtree. We give the highest rank to this leaf. After that we may add new time points below the last time point. If we add new time points, we should rerank the leaves with the highest rank such that the new tree has enough leaves at each time point. Let x= [x_{i,j}] be a matrix of sizet × q fort ≤ k andq ≤ l. This matrix represents an intermediate tree withs−1 leaves. Suppose we add a new leaf to subtreei_{ 0 }, where 1≤i_{ 0 }≤ t. Letq_{ 0 } be the number of time points in a new tree and, therefore,q ≤ q_{ 0 }≤ r(t+1) (orq ≤ q_{ 0 }≤ l ift=k). Define a functiona d d o n e((x,r_{ t }),i_{ 0 }, q_{ 0 }) which returns a tuple $({\mathbf{x}}^{\prime},r{}_{\mathbf{t}})$, where${\mathbf{x}}^{\prime}=\phantom{\rule{0.3em}{0ex}}\left[\phantom{\rule{0.3em}{0ex}}{x}_{i,j}^{\prime}\right]$ is a matrix of sizet×q_{ 0 } such that

$${x}_{{i}_{0},{q}_{0}}^{\prime}={x}_{{i}_{0},q}{\sum}_{j=q}^{{q}_{0}1}{n}_{{i}_{0},j}+1$$
,

$${x}_{i,{q}_{0}}^{\prime}={x}_{i,q}{\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$$
fori≠i_{ 0 },

$${x}_{i,j}^{\prime}={x}_{i,j}$$
for 1≤i≤t andj<q, and

$${x}_{i,j}^{\prime}={n}_{i,j}$$
for 1≤i≤t andq≤j<q_{ 0 }.
If this tuple is eligible then it represents a new intermediate tree withs leaves.
At each steps of the algorithm, for each tuple (x,r _{ t }) such thatF^{r}(x,r _{ t })∈ S_{s−1}, we need to propose a collection of new tuples. The first part of this procedure is Algorithm 4. To stop the procedure, when we recognise that a new tree can not havex time points, we use a predicatee x t(x) which is true if we can extend the number of time points in a new tree tox. The number of time points in a new tree can not always be extended to any arbitrary number because there may not be enough leaves of the highest rank to rerank them in such a way that there will ben_{i,j} leaves of each new rankj. Soe x t(q) is always true because the tree to which we add a new leaf already hasq time points.
Algorithm 4 Working with a tuple (x,r_{ t }) such thatF^{r}(x,r _{ t })∈S_{s−1}(part 1)
Ift < k then the proposing procedure contains the second part. In this case, we also can increase the number of leaves in an intermediate tree by adding a new subtree. That means that one of the leaves of the tree becomes a parent to two new leaves. Again we may add new time points. Letq_{ 0 } be the number of time points in a new tree and, therefore,r(t+1)<q_{ 0 }≤ r(t+2) (orr(t+1)<q_{x0}≤ l whent+1=k). We define another functiona d d c o n s t r a i n t((x,r_{ t }),q_{ 0 }) which returns a tuple $({\mathbf{x}}^{\prime},r{}_{\mathbf{t}+1})$, where ${x}^{\prime}=\phantom{\rule{0.3em}{0ex}}\left[\phantom{\rule{0.3em}{0ex}}{x}_{i,j}^{\prime}\right]$ is a matrix of size (t+1)×q_{ 0 } such that

$${x}_{t+1,{q}_{0}}^{\prime}=2{\sum}_{j=r(t+1)+1}^{{q}_{0}1}{n}_{t+1,j}$$
,

$${x}_{t+1,j}^{\prime}={n}_{t+1,j}^{\prime}$$
for 1≤j<q_{ 0 },

$${x}_{i,{q}_{0}}^{\prime}={x}_{i,q}{\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$$
for 1≤i ≤ t,

$${x}_{i,j}^{\prime}={x}_{i,j}^{\prime}$$
for 1≤i ≤ t andj < q, and

$${x}_{i,j}^{\prime}={n}_{i,j}^{\prime}$$
for 1≤i ≤ t andq ≤ j < q_{ 0 }.
This tuple defines a new tree if it is eligible. The second part of the procedure is Algorithm 5. The values ofq_{ 0 } ande x t(q_{ 0 }) are as after running Algorithm 4.
Algorithm 5 Working with a tuple (x,r _{ t }) such thatF^{r}(x,r _{ t })∈S_{ s −1 }(part 2)
Finally, as before, the main algorithm calculates setsS_{ s } recursively for 0≤s ≤ n, wheren is the number of leaves in a constraint tree $\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$, and a unique element ofS_{ n } isS^{r}(n,r).
Letm be a number of sampling points, i.e.,m=l−k.
Proposition 2
Ifk is fixed then the algorithm does at mostO(m) steps for each eligible tuple.
Proof
Let (x r _{ t }) be an eligible tuple. First, having all the necessary summands for equations (9) and (10), we need to calculateF^{r}(x,r _{ t }). This takes at mostO ( t ×[ r(t+1)−r(t)]) steps. Since we assumek is a constant and sincet≤k andr ( t+1)−r(t)≤m, it isO(m). Second, we need to perform the procedure of proposing new tuples for this tuple. The most expensive part here is calculation of the last columns of matrices ${\mathbf{x}}^{\prime}$ because it involves calculation of the sums ${\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$. Note that, storing the values of ${x}_{i,q}{\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$ for 1≤i ≤ t at each stepq_{ 0 }, we can optimise these calculations. Then it takesO(t×(q − r ( t+2)) steps and it isO ( m) again. □
Proposition 3
If k is fixed then the algorithm has O(^{m}n^{k}) time complexity.
Proof
Leti_{ x } be such a number thatr(i_{ x })≤x < r(i_{ x }+1). Then the number of eligible tuples is
Note that
2whereb_{ 1 }+…+ b_{ k }= n + k−1. Then
From this and Proposition 2, it follows that the algorithm does at mostO(m^{2}n^{k}) steps. □
References
 1.
Yang Z, Rannala B: Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method. Mol Biol Evol. 1997, 14 (7): 717724. 10.1093/oxfordjournals.molbev.a025811.
 2.
Yang Z, Rannala B: Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol Biol Evol. 2005, 23: 212226. 10.1093/molbev/msj024.
 3.
Heled J, Drummond A: Tree priors for relaxed phylogenetics and divergence time estimation. Syst Biol. 2012, 61: 138149. 10.1093/sysbio/syr087.
 4.
Rodrigo AG, Felsenstein J: The Evolution of HIV. 1999, Baltimore: Johns Hopkins Univ Press
 5.
Drummond A, Pybus O, Rambaut A, Forsberg R, Rodrigo A: Measurably evolving populations. Trends Ecol Evol. 2003, 18: 481488. 10.1016/S01695347(03)002167.
 6.
Felsenstein J: Inferring Phylogenies. 2004, Sunderland: Sinauer Associates
 7.
Murtagh F: Counting dendograms: a survey. Discrete Appl Math. 1984, 7: 191199. 10.1016/0166218X(84)900660.
 8.
Gordon AD: A review of hierarchical classification. J R Stat Soc A. 1987, 150 (2): 119137. 10.2307/2981629.
 9.
Semple C, Steel M: Phylogenetics. 2003, New York: Oxford University Press
 10.
Griffiths RC: Counting genealogical trees. J Math Biol. 1987, 25: 423431. 10.1007/BF00277166.
 11.
Felsenstein J: The number of evolutionary trees. Syst Zool. 1978, 23: 2733.
 12.
Stadler T: Sampling‐through‐time in birth‐death trees. J Theor Biol. 2010, 267 (3): 396404. 10.1016/j.jtbi.2010.09.010.
 13.
Stadler T, Kouyos RD, von Wyl, Yerly S, Böni J, Bürgisser P, Klimkait T, Joos B, Rieder P, Xie D, Günthard HF, Drummond A, Bonhoeffer S, the Swiss HIV Cohort Study: Estimating the basic reproductive number from viral sequence data. Mol Biol Evol. 2012, 29: 347357. 10.1093/molbev/msr217.
Acknowledgements
AJD was partially funded by a Rutherford Discovery Fellowship from the Royal Society of New Zealand.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
The authors equally contributed to conceive the work. AG drafted the manuscript and DW and AJD revise it. All authors read and approved the final manuscript.
Rights and permissions
About this article
Received
Accepted
Published
DOI
Keywords
 Ranked tree
 Constraint tree
 Resolution
 Counting trees
 Dynamic algorithms
 Bayesian tree prior
 Phylogenetics