Recursive algorithms for phylogenetic tree counting
 Alexandra Gavryushkina^{1}Email author,
 David Welch^{1} and
 Alexei J Drummond^{1, 2}
https://doi.org/10.1186/17487188826
© Gavryushkina et al.; licensee BioMed Central Ltd. 2013
Received: 15 February 2013
Accepted: 8 October 2013
Published: 28 October 2013
Abstract
Background
In Bayesian phylogenetic inference we are interested in distributions over a space of trees. The number of trees in a tree space is an important characteristic of the space and is useful for specifying prior distributions. When all samples come from the same time point and no prior information available on divergence times, the tree counting problem is easy. However, when fossil evidence is used in the inference to constrain the tree or data are sampled serially, new tree spaces arise and counting the number of trees is more difficult.
Results
We describe an algorithm that is polynomial in the number of sampled individuals for counting of resolutions of a constraint tree assuming that the number of constraints is fixed. We generalise this algorithm to counting resolutions of a fully ranked constraint tree. We describe a quadratic algorithm for counting the number of possible fully ranked trees on n sampled individuals. We introduce a new type of tree, called a fully ranked tree with sampled ancestors, and describe a cubic time algorithm for counting the number of such trees on n sampled individuals.
Conclusions
These algorithms should be employed for Bayesian Markov chain Monte Carlo inference when fossil data are included or data are serially sampled.
Keywords
Background
A phylogenetic tree is the common object of interest in many areas of biological science. The tree represents the ancestral relationships between a group of individuals. Given molecular sequence data sampled from a group of organisms it is possible to infer the historical relationships between these organisms using a statistical model of molecular evolution. At present, Bayesian Markov chain Monte Carlo (MCMC) methods are the dominant inferential tool for inferring molecular phylogenies [1].
It is a recent trend to include fossil evidence into the inference to obtain absolute estimates of divergence times [2, 3]. Fossils may restrict the age of the most resent common ancestor of a subgroup of individuals. This imposes a constraint on the tree topology (the discrete component of a genealogy) and therefore reduces the space of allowable genealogies.
Another trend in phylogenetic analyses is serial (or heterochronous) sampling in which molecular data is obtained from significantly different time points and analysed together. This type of data arises most frequently with ancient DNA and rapidly evolving pathogens [4]‐[6]. In this case tip dates become a part of the genealogy.
Including serially sampled or fossil data modifies or restricts the shape of a phylogenetic tree. Little has been done to describe and classify these modified trees. In this paper, we aim to explore the new spaces formed by these trees.
A genealogy consists of discrete and continuous components — the tree topology and the divergence times. The tree topologies form a finite tree space when the number of tips is bounded. An important characteristic of this space is the number of trees in it and we aim to find an efficient way to calculate this number.
In the case that fossil data restricts the tree topology, counting the number of trees that satisfy the imposed constraints reveals how much the constraints reduce the tree space.
The number of trees arises as a constant in tree prior distributions. Typically we model the distribution of tree topologies as independent of the distribution of divergence times. The density function of the distribution of genealogies is then a product of the density function for the divergence times and the distribution function for tree topologies. A common prior on tree topologies is uniform over all allowable topologies so the distribution function is a constant that is equal to one over the number of tree topologies. When inferring tree topologies using Bayesian MCMC methods, we do not usually need to know this constant but in some cases, as described below, the absolute value of the prior distribution is of interest and the constant has to be calculated.
When fossils are used to restrict the age of internal nodes, the tree prior should accurately account for this fact. Heled and Drummond [3] introduced a natural approach for tree prior specification when fossil evidence is employed in the inference. Their method requires counting of ranked phylogenetic trees that obey a number of constraints that arise from including the fossil evidence. The construction requires calculation of the marginal density for the time of the calibration node, the node representing the most recent common ancestor of a clade which may or may not be monophyletic. For a particular location of the calibration node, or particular constraints on the tree topology, the marginal density function is the marginal density function for the divergence times weighted by the number of trees satisfying the constraints. In this case, the weight constants do not cancel in the MCMC scheme and therefore have to be calculated.
Tree counting has a long history. For phylogenetic trees, the counting problem is to find the number of all possible trees on n leaves. For some types of phylogenetic tree, there are known closed form solutions to this problem. For other types, only recursive equations have been derived. In this paper, we consider only rooted trees.
A survey of results on counting different types of rooted trees is presented in [7] where trees with different combinations of the following properties are considered: trees are either labeled (only leaves are labeled) or unlabeled, ranked or non‐ranked, and bifurcating or multifurcating. The results presented in the survey can also be found in [6, 8, 9].
In [10], Griffiths considered unlabeled, non‐ranked rooted trees such that interior nodes can have one child or more and the root has at least two children. Using generating functions, he derived recursive equations for counting the number of all possible such trees on n leaves with s interior nodes. In [11], Felsenstein considered partially labeled trees, i.e., a tree in which all the leaves are labeled and some interior nodes also may be labeled. He derived the recursive equations for counting the number of rooted, non‐ranked, partially labeled trees with n labeled nodes.
In this paper, we consider a number of counting problems for different classes of phylogenetic trees. First, we describe an effective way of counting the number of all possible fully ranked trees on n leaves, that is, trees on n leaves in which all internal and leaf nodes are ranked.
Second, we find the number of bifurcating trees that resolve a given multifurcating tree with n leaves. We give a solution to this problem for rooted, ranked, labeled trees and generalise the algorithm to count resolutions to fully ranked trees.
Finally, we introduce and formally describe a new type of phylogenetic tree and describe an algorithm for counting the number of all such trees on n leaves. This type of tree is important when we have a serial sample and sampled individuals can be direct ancestors of later sampled individuals. When the population size is small or the fraction of individuals sampled from the population is large, this type of tree should be included in the inference [12, 13].
Serial sampling
We mainly follow the terminology from [9] for the definitions of phylogenetic trees. A tree is a finite connected undirected graph with no cycles. A rooted tree is a tree with a single node ρ designated as a root. Every rooted tree T=(V,E,ρ) imposes a partial order on V that is defined as follows: v_{1}≤_{ T }v_{2} if a unique simple path from the root to v_{2} passes through v_{1}. So the root is the smallest element. If v_{1}≤_{ T }v_{2} then we say that v_{1} is an ancestor of v_{2} and v_{2} is a descendant of v_{1}. A node in a rooted tree is called interior if it has descendants and a leaf if it has no descendants. The root is considered interior. Denote $\stackrel{\circ}{V}$ the set of interior nodes of T. A node u is a parent of a node v and v is a child of u if v<_{ T }u and there is no w∈V such that v<_{ T }w<_{ T }u. A rooted tree is called binary if every interior node has exactly two children. It is called weakly binary if every interior node has at most two children. We have chosen this terminology to fit with the usage of “binary” in the phylogenetics literature which may not agree with that in other literatures.
Let X be a finite non‐empty set of labels. A phylogenetic X‐tree is a pair $\mathcal{T}=(T,\varphi )$, where T is a tree and ϕ is a bijection from X onto the set of leaves of T (we may omit X and say “tree” instead of “X‐tree” if the set of labels is not specified). The tree T is called an underlying tree or a shape of the phylogenetic tree and ϕ is a labeling function. If the underlying tree of is rooted then is called a rooted phylogenetic tree. In what follows, we consider only rooted trees unless explicitly stated otherwise. A phylogenetic tree is binary (weak binary) if its underlying tree is binary (weak binary). A ranked phylogenetic tree is a pair $(\mathcal{T},h)$, where is a rooted phylogenetic tree and h is an injective function (ranking function) from the set $\stackrel{\circ}{V}$ into the set $\{1,\dots ,\stackrel{\circ}{V}\left\right\}$ such that v_{1}≤_{ T }v_{2} implies h(v_{1})≤h(v_{2}) for every ${v}_{1},{v}_{2}\in \stackrel{\circ}{V}$. In other words, there is a linear order on the interior nodes of T that is consistent with the partial order of T.
Definition 1
A ranked X‐tree is a binary ranked phylogenetic X‐tree.
In biology, a phylogenetic tree represents the evolutionary history of a collection of sampled individuals. The collection of individuals is represented by the set X. The root of the tree is the most recent common ancestor of X and interior nodes are bifurcation events. The ranking function represents the time order of the bifurcation events. A general problem in evolutionary biology is how to reconstruct the phylogenetic tree from sequence data obtained from sampled individuals. Tackling this problem in a Bayesian framework may require counting the number of all possible histories on a sample of individuals.
When all individuals are sampled at the same time (as in Figure 1) counting tree problem has a simple solution.
This formula has been derived by many authors. Proofs can be found in [6, 7], or [9]. The letter R in the equation comes from the word “ranked”.
The situation is different when individuals are sampled at different times (serially sampled). In this case, we need to define another kind of phylogenetic tree in which leaves are also ranked.
Definition 2
A fully ranked (FR) X‐tree is a pair$(\mathcal{T},h)$, where is a binary rooted phylogenetic X‐tree and h:V→{1,…,l} with$\left\stackrel{\circ}{V}\right<l\le \leftV\right$is a surjective function such that

v_{1}≤_{ T }v_{2} implies h(v_{1})≤h(v_{2}) and

h(v_{1})=h(v_{2}) implies v_{1}=v_{2} or ${v}_{1},{v}_{2}\in V\setminus \stackrel{\circ}{V}$.
Before the tree is reconstructed we observe only leaves (sampled individuals) of the tree that are grouped (pre‐ranked) according to the times they were sampled. For the tree shown in Figure 2, we have two sampling times and hence two groups: A, B, and C form the first group, D and E form the second group.
Let $\mathfrak{T}=(\mathcal{T},h)$ be a fully ranked X‐tree with h:V→{1,…,l}. Let m=h(ϕ(X)), that is, the number of sampling times. Define a pre‐ranking function$\phantom{\rule{0.5em}{0ex}}\u0125$ from X onto {1,…,m} for tree $\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$ such that for all x_{1},x_{2}∈X

h(ϕ(x_{1}))≤h(ϕ(x_{2})) implies $\u0125({x}_{1})\le \u0125({x}_{2})$ and

h(ϕ(x_{1}))=h(ϕ(x_{2})) iff $\u0125({x}_{1})=\u0125({x}_{2})$.
For the tree given in Figure 1, $\u0125(A)=\u0125(B)=\u0125(C)=1$ and $\u0125(D)=\u0125(E)=2$.
and F(n)=R(n).
Proof
Consider a continuous process of bifurcation in which lineages may bifurcate in time or be cut and labeled (sampled). The process finishes when all lineages are cut producing a tree. The discrete structure of the tree produced by this process is a fully ranked X‐tree. It is easy to see that every fully ranked X‐tree can be obtained as a result of this process. To count the required number we can count the number of different trees which can be produced by the process if we know that after it finishes there are n_{ i } sampled individuals (i.e., cut and labeled lineages) at the ith time point, i.e., we have the sequence (n_{1},…,n_{ m }).
Suppose that at the (m−1)th time point there are i lineages that are ancestral to n_{ m } individuals sampled at time m. When we look at this process backwards in time the bifurcation events become coalescence events. The number of different ways these n_{ m } lineages coalesce to i lineages is $\frac{R({n}_{m})}{R(i)}$. This is the number of all possible ranked X‐trees on n_{ m } individuals but since we are not interested in the structure of the coalescent after we reach i lineages, it is divided by the number of ways in which the remaining i lineages can coalesce. Note that if coalescence patterns are different between the (m−1)th and mth time points then the trees are also different.
Further, for each of these coalescence patterns, we need to count the number of different ways these i lineages and other n_{1},…,n_{m−1} lineages can coalesce. This is where we can apply the recursion. We can consider that we also cut these i lineages at time m−1 and label them with the ranked subtrees descendant from these lineages. Then, at time m−1, we have n_{m−1} sampled individuals and another i sampled individuals and it remains to count the number of trees on the sequence (n_{1},…,n_{m−1}+i).
Note that two trees are different if they have different numbers of lineages at time m−1. The number of additional i lineages can be between 1 and n_{ m } and we need to sum over all possible i to complete the recursion. □
We introduce a third type of tree in which sampled individuals may be direct ancestors of later sampled individuals. We call it a tree with sampled ancestors. This type of tree is not usually considered in phylogenetics since the probability of sampling a direct ancestor is often negligible. In small populations or when a large portion of the population is sampled, however, this can not be ignored.
A rooted S‐phylogenetic X‐tree is a pair $\mathcal{T}=(T,\varphi )$, where T is a weak binary tree and $\varphi :X\to \widehat{V}$ is a bijection.
Definition 3

v_{1}<_{ T }v_{2} implies h(v_{1})<h(v_{2}) and

h(v_{1})=h(v_{2}) implies v_{1}=v_{2} or ${v}_{1},{v}_{2}\in \widehat{V}$,
and S(n)=R(n).
Proof
Consider the same process as before with only change of sampling events. Now, at some points of time, some lineages are cut and labeled and others are only labeled but not cut.
Then this equation can be obtained as follows. We have n_{ m } individuals that are sampled at the mth time point. At time m−1, there are between 1 and n_{ m } ancestral lineages of those n_{ m } individuals, depending on the number of coalescence between times m and m−1. Let i denote the number of these lineages. Then there are $\frac{R({n}_{m})}{R(i)}$ different possible coalescent patterns that can lead to this situation. Some of these i ancestral lineages may be among the individuals sampled at time m−1, i.e. lineages that are labeled but not cut at time m−1. Let j be the number of those ancestral lineages that are sampled at time m−1. There are $\left(\genfrac{}{}{0.0pt}{}{i}{j}\right)$ ways to chose these j lineages out of i and there are $\left(\genfrac{}{}{0.0pt}{}{{n}_{m1}}{j}\right)$ possible ways to chose j sampled at time m−1 individuals that are not cut at time m−1.
Further, at time m−1, there are n_{m−1} sampled lineages and i−j ancestral lineages that are not sampled and it remains to count the number of FRS trees on the sequence (n_{1},…,n_{m−1}+i−j).
Finally, we sum over all possible i and j to complete the recursion. □
Dynamic counting
This leads to Algorithm 1 to calculate F(n_{1},…,n_{ m }). Let n be the number of sampled individuals, i.e., $n=\sum _{i=1}^{m}{n}_{i}$. Calculation of all the R(i) takes O(n) steps and calculation of all the A^{ j } takes at most O(n) steps. In total, the algorithm takes O(m n) steps.
Algorithm 1 Calculating the number of fully ranked trees
A similar approach leads to an O(m n^{2})‐time algorithm for counting FRS trees. The description of this algorithm is in Appendix Appendix 1: Algorithm for counting FRS trees.
Constraints
In phylogenetic analysis, it is common to have some limited information about the ancestors of sampled individuals. We consider two types of such information. First, we may know that a subgroup of sampled individuals is monophyletic. That means that the most recent common ancestor of the subgroup is not an ancestor of any other individual that does not belong to the subgroup. Second, we may know the relative ages of the most recent common ancestors of monophyletic subgroups. This known information imposes constraints on the space of possible phylogenetic trees representing the evolutionary history of sampled individuals. The question is how many phylogenetic trees satisfy the constraints on a group of sampled individuals?
The number of resolutions of a constraint tree
We first describe a problem for contemporaneous sampling in terms of constraint trees. We call a rooted tree multifurcated if each interior node has at least 2 children. Note that in contrast to the common terminology we assume that a binary tree is also multifurcated. If we replace the word “binar” with “multifurcated” in the definition of a ranked tree we obtain a more general class of trees.
Definition 4
A constraint X‐tree is a multifurcated ranked phylogenetic X‐tree.
We say that a ranked X‐tree ${\mathfrak{T}}_{1}=({\mathcal{T}}_{1},{h}_{1})$resolves a constraint X‐tree ${\mathfrak{T}}_{2}=({\mathcal{T}}_{2},{h}_{2})$ if there is an isomorphic embedding of ${\mathfrak{T}}_{2}$ into ${\mathfrak{T}}_{1}$, i.e., there is an injective mapping f:V_{ 2 }→V_{ 1 }such that

ϕ_{ 1 }(x)=f(ϕ_{ 2 }(x)) for each x∈X;

$v{\le}_{{T}_{2}}u$
iff $f(v){\le}_{{T}_{1}}f(u)$ for each u,v∈V_{ 2 }, and

h_{ 2 }(v)≤h_{ 2 }(u) implies h_{ 1 }(f(v))≤h_{ 1 }(f(u)) for each $u,v\in \underset{2}{\overset{\circ}{V}}$.
We wish to calculate the number of ranked trees that resolve a given constraint tree $\mathfrak{T}=(T,\varphi ,h)$. It is easy to see that this number depends only on the underlying tree T and ranking function h, but does not depend on the labeling function ϕ or the label set X.
We now introduce some notation in order to define recursive equations. We label interior nodes according to their ranks such that node i is a node v such that h(v)=i. A subtree induced by node i and its children is called subtree i. Child nodes in a subtree may be leaves in the initial tree. Let n_{ i }≥0 denote the number of such child nodes in subtree i. Let f:{2,…,k}→{1,…k−1} be the parent function on interior nodes, i.e., f(i)=j whenever j is a parent to i. When k=1, i.e., there is only one interior node, f=∅. See Figure 4 for an example of introduced notation. Note that a tuple (n_{ 1 },…,n_{ k },f) completely defines a pair (T,h).
where C is the collection of nodes that have more than 2 children and at least 2 of them are leaves, i.e., C={i<k  n_{ i }≥2 and n_{ i }+α_{ i }>2} and α_{ i }={a f(a)=i}. Note that n_{ i }+α_{ i } is the number of children of node i.
Proof
So in the general case, there are k subtrees and if the last node is in subtree i we have $\left(\genfrac{}{}{0.0pt}{}{{n}_{i}}{2}\right)$ ways to choose two lineages that coalesce and then we should count the number of trees resolving the tree defined by (n_{ 1 },…,n_{ i }−1,…,n_{ k },f). Note that in the example above, we consider the tree with more than 2 leaves in each subtree. However, the last interior node of a resolving tree can not be in subtree i for i<k if there is not enough leaves in this subtree. This can happen either if there are less than 2 leaves in subtree i or if there are 2 leaves in subtree i and node i has only these two leaves as its children. Both cases imply that any parent to leaves of subtree i in a resolving tree has a lesser rank than the rank of node k. This explains why we sum only over the elements of the set C.
Finally, we should consider one more case which explains why there is one more summand in equation (7). If the last node in a constraint tree has only two children, i.e., there are 2 leaves in subtree k, then there is one more group of resolving trees, the group that consists of resolving trees that have this node as the last node. □
Dynamic counting
We will calculate R^{ r }(n_{ 1 },…,n_{ k },f) for the corresponding constraint tree. In order to find R^{ r }(n_{ 1 },…,n_{ k },f), at each step s, we will calculate numbers ${R}^{r}({x}_{1},\dots ,{x}_{t},f{}_{{\mathbf{t}}_{1}})$ with $\sum _{i\le t}{x}_{i}=s$, t_{ −1 }={2,…,t}, and t≤k. Note that we do not have to calculate all such numbers. To determine which numbers are required we define two upper triangular matrices m and M of size k×k.
Let t≤k and ${x}_{1},\dots ,{x}_{t}\in \mathbb{N}$. We call a tuple $({x}_{1},\dots ,{x}_{t},f{}_{{\mathbf{t}}_{1}})$ eligible if m_{ i,t }≤x_{ i }≤M_{ i,t } for 1≤i≤t.
We now turn to Algorithm 2 to count resolutions. At each step s≤n, we construct a set S_{ s }. A unique element of S_{ n } is R^{ r }(n_{ 1 },…,n_{ k },f) and calculating elements of S_{ s } only requires elements of S_{ s−1 }.
Algorithm 2 Calculating the number of resolutions of a constraint tree
Proposition 1
When k is fixed Algorithm 2 does at most O(n^{ k }) steps.
Proof
□
The number of resolutions of a fully ranked constraint tree
We can generalise the results of the previous section to fully ranked trees. Now we replace the word “binary” with “multifurcated” in the definition of a fully ranked tree to get
Definition 5
A fully ranked constraint X‐tree is a pair $(\mathcal{T},h)$ , where is a multifurcated rooted phylogenetic X‐tree and h is a function such that h:V→{1,…,l}, where $\left\stackrel{\circ}{V}\right<l\le \leftV\right$ , and

v _{ 1 } ≤ _{ T } v _{ 2 } implies h(v _{ 1 } )≤h(v _{ 2 } ),

h(v_{ 1 })=h(v_{ 2 }) implies v_{ 1 }=v_{ 2 }or ${v}_{1},{v}_{2}\in V\setminus \stackrel{\circ}{V}$.
We say that a fully ranked X‐tree ${\mathfrak{T}}_{1}=({\mathcal{T}}_{1},{h}_{1})$resolves a fully ranked constraint X‐tree ${\mathfrak{T}}_{2}=({\mathcal{T}}_{2},{h}_{2})$ if there is an isomorphic embedding of ${\mathfrak{T}}_{2}$ into ${\mathfrak{T}}_{1}$, i.e., there is an injective mapping f:V_{ 2 }→V_{ 1 }such that

ϕ_{ 1 }(x)=f(ϕ_{ 2 }(x)) for each x∈X;

$v{\le}_{{T}_{2}}u$
iff $f(v){\le}_{{T}_{1}}f(u)$ for each u,v∈V_{ 2 },

h_{ 2 }(v)≤h_{ 2 }(u) implies h_{ 1 }(f(v))≤h_{ 1 }(f(u)) for each u,v∈V_{ 2 }, and

h_{ 2 }(v)=h_{ 2 }(u) iff h_{ 1 }(f(v))=h_{ 1 }(f(u)) for each u,v∈V_{ 2 }.
The problem is to count the number of fully ranked resolutions of a fully ranked constraint tree.
Let $\mathfrak{T}=(\mathcal{T},h)$ be a fully ranked constraint tree with h:V→{1,…,l} and$\left\stackrel{\circ}{V}\right=k$. Again we label interior nodes with numbers from {1,…,k}. However, labels and ranks of interior nodes do not necessary coincide now. Let $r:\{1,\dots ,k\}\to h(\stackrel{\circ}{V})$ be an injective increasing function which maps the kth interior node to its rank. Note that r(1) is always equal to 1 because the root node always has rank 1 and that r(k)<l because only leaves can have rank l. Now, node i is the node that has rank r(i) and subtree i is the subtree which is induced by node i and its children. Since leaves in subtrees are now ranked, we need more parameters to encode the tree. Let n= [ n_{i,j}] be a matrix of sizek×l, wheren_{i,j} is the number of leaves of rankj in subtreei or the number of children of nodei with rankj. Note that if nodei is a parent of nodej thenn_{i,r(j)}=1 andn_{i,x}=0 forx ≠ r(j) and this means that the parent function is uniquely defined by the matrix n and functionr and the number of resolutions of$\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$ depends only on (n,r).
withr_{ 0 } mapping 1 to 1.

${n}_{i,j}^{c,p}={n}_{i,j}$
for 1≤i≤k andj<p,

,${n}_{c,p}^{c,p}={N}_{c,p}1$

${n}_{i,p}^{c,p}={N}_{i,p}$
fori≤k andi≠c.

${n}_{i,j}^{\prime}={n}_{i,j}$
for 1≤i≤k andj<r(k),

${n}_{i,r(k)}^{\prime}={N}_{i,r(k)}$
for 1≤i≤k−1.
This matrix defines a constraint tree for which the number of resolutions is equal to the number of trees in the last group.
Using these equations, we can calculateF^{ r }(n,r) forO(m^{ 2 }n^{k}) steps, wherem is the number of sampling points, i.e.,m=l−k. The calculation is described in Appendix Appendix 2: Algorithm for counting fully ranked resolutions of a fully ranked constraint tree.
Conclusions
The complexity of algorithms
No constraints ( k=1)  k constraints  

Contemporaneous  
sampling  O(n)  O(n^{ k }) 
(m=1)  
Serial sampling  
with no sampled  O(m n)  O(m^{ 2 }n^{ k }) 
ancestors  
Serial sampling  
with sampled  O(m n^{ 2 })  ‐ 
ancestors 
These algorithms can be implemented in software for phylogenetic analysis that involves serial sampling scheme or limited prior knowledge about ancestors of particular clades for calculating tree prior distributions.
Appendix 1: Algorithm for counting FRS trees
where $A(i,a,b)=\sum _{x=0}^{\mathit{\text{min}}\{a,bi\}}\frac{R(b)}{R(i+x)}\left(\genfrac{}{}{0.0pt}{}{i+x}{x}\right)\left(\genfrac{}{}{0.0pt}{}{a}{x}\right)$. Then we can calculateS(n_{ 1 },…, n_{ m }) recursively using Algorithm 3. We use the notation ${N}_{j}=\sum _{i=j}^{m}{n}_{i}$.
Algorithm 3 Calculating the number of FRS trees
At each stepj>1 of the algorithm, we calculateS(n_{ 1 },…, n_{ j }+α) for 0≤α≤N_{j+1}. We skip stepj=1 and do not calculateS ( n_{ 1 }+ α) for 0≤α ≤ N_{ 2 } becauseS ( x )= R ( x) and we have already calculated all the necessaryR(x). Further, for calculating eachS(n_{ 1 },…, n_{ j }+ α), we need to calculate the coefficientsA and this is the most expensive part of the algorithm. So, at each stepj>1, we need to calculateA ( i , n_{j−1},n_{ j }+ α) for 0≤i ≤ n_{ j }+ α and 0≤α ≤ N_{j+1}.
We use equation (11) for 0≤x≤m i n { n_{j−1},β}. Ifm i n{n_{j−1},β+1}≠m i n { n_{j−1},β} and thereforem i n { n_{j−1},β+1}=m i n { n_{j−1},β}+1=β+1 then we also use equation (12). The cost of this calculation is dominated by the number of the summandsB ( n_{ j }− β,n_{j−1},n_{ j },x) for 0≤x ≤ m i n { n_{j−1},β} and 0≤β ≤ n_{ j }. This number is $O({n}_{j}^{2})$.
where 0≤i ≤ b,a = n_{j−1}, andb=n_{ j }+α.
To apply this recursion we need an initial value. This value depends onn_{j−1},n_{ j }, andα and, since only the first case of equation (13) contains the second summand, it is not necessary the value of this summand fori=0. There are 3 cases for calculation of all the initial values that are necessary at stepj.
Case 1: N_{ j }< n_{j−1}. That means that we use the first case of equation (13) for allα. In this case, for allα, the initial value for recursion (14) is the value of the second summand in (13) wheni=0. So we need to calculate $\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+\alpha}{0}\right)\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+\alpha +0}\right)$ for 1≤α ≤ N_{j+1} and those are $\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+1}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+{N}_{j+1}}\right)$. This takesO(N_{ j }) steps.
Case 2: n_{ j }+α_{ 0 }= n_{j−1} for someα_{ 0 }∈{0,…, N_{j+1}}. Ifα_{ 0 }>0 then, for allα ≤ α_{ 0 }, we use the first case of (13) and therefore we need to calculate $\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+1}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+{\alpha}_{0}}\right)$, this takesO ( α_{ 0 }) steps. For allα > α_{ 0 }, we use the second case first and, starting fromi = α − α_{ 0 }, we use only the first case. So we need to calculate $\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+\alpha}{\alpha {\alpha}_{0}}\right)\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}}{{n}_{j}+\alpha (\alpha {\alpha}_{0})}\right)$ forα∈{α_{ 0 }+1,…,N_{j+1}}, which are$\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}+1}{1}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j1}+({N}_{j+1}{\alpha}_{0})}{({N}_{j+1}{\alpha}_{0})}\right)$. This takesO(N_{j+1}−α_{ 0 }) steps. In total, we haveO ( N_{j+1}) steps.
Case 3: n_{j−1}<n_{ j }. That means that, for allα, we use the second case first and, starting fromi = n_{ j }+ α − n_{j−1}, we use only the first case. Calculate$\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+1}{{n}_{j1}}\right),\dots ,\left(\genfrac{}{}{0.0pt}{}{{n}_{j}+{N}_{j+1}}{{n}_{j1}}\right)$ forO ( n_{j−1}+N_{j+1}) steps.
Each case costs at mostO ( N_{ j }). Provided that the initial values are calculated, the cost of the rest calculation at stepj is dominated by the number of the coefficientsA ( i,n_{j−1},n_{ j }+ α) for 0≤i ≤ n_{ j }+ α and 1≤α ≤ N_{j+1} and it is$O({N}_{j}^{2})$. Summing up$O({N}_{j}^{2})$X,O ( N_{j−1}), and $O({N}_{j}^{2})$ gives us the cost of each stepj, which is $O({N}_{j}^{2})$. Since 1<j ≤ m and calculation ofR ( i) for 0≤i ≤ N_{ 1 }= n takesO ( n), the algorithm doesO ( m n^{ 2 }) steps in total.
Appendix 2: Algorithm for counting fully ranked resolutions of a fully ranked constraint tree
The algorithm for counting resolutions of a constraint tree requires a few changes to count resolutions of a fully ranked constraint tree. Recall that, at each steps, we calculated the setS_{ s } which consists of the numbers of resolutions of intermediate trees withs leaves. To constructS_{ s }, for each element ofS_{s−1}, we proposed a collection of tuples that may define intermediate trees withs leaves. To accept eligible tuples we defined two matrices m and M. The general scheme for counting resolutions of a fully ranked constraint tree is the same. However, we need to define matrices m and M for a fully ranked constraint tree and describe the procedure of proposing new tuples. The procedure becomes more technical because of additional ranking.
As before, if we consider a horizontal line that is strictly between the horizontal line that passes through the nodes of rankj and the horizontal line that passes through the nodes of rank (j+1) then these matrices determine the minimal and maximal possible numbers of intersections of this line with branches of subtreei.
Let x= [x_{i,j}] be a matrix of sizet × q, wheret ≤ k andq ≤ l, and t={1,…,t}. A tuple (x,r _{t}) is eligible if

x_{ i , j }x=n_{i,j} for 1≤i ≤ t and 1≤j < q, and

${\mathtt{m}}_{i,q1}\le {x}_{q}^{i}\le {\mathtt{M}}_{i,q1}$
for 1≤i ≤ t.

,${x}_{{i}_{0},{q}_{0}}^{\prime}={x}_{{i}_{0},q}{\sum}_{j=q}^{{q}_{0}1}{n}_{{i}_{0},j}+1$

${x}_{i,{q}_{0}}^{\prime}={x}_{i,q}{\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$
fori≠i_{ 0 },

${x}_{i,j}^{\prime}={x}_{i,j}$
for 1≤i≤t andj<q, and

${x}_{i,j}^{\prime}={n}_{i,j}$
for 1≤i≤t andq≤j<q_{ 0 }.
If this tuple is eligible then it represents a new intermediate tree withs leaves.
At each steps of the algorithm, for each tuple (x,r _{ t }) such thatF^{ r }(x,r _{ t })∈ S_{s−1}, we need to propose a collection of new tuples. The first part of this procedure is Algorithm 4. To stop the procedure, when we recognise that a new tree can not havex time points, we use a predicatee x t(x) which is true if we can extend the number of time points in a new tree tox. The number of time points in a new tree can not always be extended to any arbitrary number because there may not be enough leaves of the highest rank to rerank them in such a way that there will ben_{i,j} leaves of each new rankj. Soe x t(q) is always true because the tree to which we add a new leaf already hasq time points.
Algorithm 4 Working with a tuple (x,r_{ t }) such thatF^{ r }(x,r _{ t })∈S_{s−1}(part 1)

,${x}_{t+1,{q}_{0}}^{\prime}=2{\sum}_{j=r(t+1)+1}^{{q}_{0}1}{n}_{t+1,j}$

${x}_{t+1,j}^{\prime}={n}_{t+1,j}^{\prime}$
for 1≤j<q_{ 0 },

${x}_{i,{q}_{0}}^{\prime}={x}_{i,q}{\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$
for 1≤i ≤ t,

${x}_{i,j}^{\prime}={x}_{i,j}^{\prime}$
for 1≤i ≤ t andj < q, and

${x}_{i,j}^{\prime}={n}_{i,j}^{\prime}$
for 1≤i ≤ t andq ≤ j < q_{ 0 }.
This tuple defines a new tree if it is eligible. The second part of the procedure is Algorithm 5. The values ofq_{ 0 } ande x t(q_{ 0 }) are as after running Algorithm 4.
Algorithm 5 Working with a tuple (x,r _{ t }) such thatF^{ r }(x,r _{ t })∈S_{ s −1 }(part 2)
Finally, as before, the main algorithm calculates setsS_{ s } recursively for 0≤s ≤ n, wheren is the number of leaves in a constraint tree $\phantom{\rule{0.5em}{0ex}}\mathfrak{T}$, and a unique element ofS_{ n } isS^{ r }(n,r).
Letm be a number of sampling points, i.e.,m=l−k.
Proposition 2
Ifk is fixed then the algorithm does at mostO(m) steps for each eligible tuple.
Proof
Let (x r _{ t }) be an eligible tuple. First, having all the necessary summands for equations (9) and (10), we need to calculateF^{ r }(x,r _{ t }). This takes at mostO ( t ×[ r(t+1)−r(t)]) steps. Since we assumek is a constant and sincet≤k andr ( t+1)−r(t)≤m, it isO(m). Second, we need to perform the procedure of proposing new tuples for this tuple. The most expensive part here is calculation of the last columns of matrices ${\mathbf{x}}^{\prime}$ because it involves calculation of the sums ${\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$. Note that, storing the values of ${x}_{i,q}{\sum}_{j=q}^{{q}_{0}1}{n}_{i,j}$ for 1≤i ≤ t at each stepq_{ 0 }, we can optimise these calculations. Then it takesO(t×(q − r ( t+2)) steps and it isO ( m) again. □
Proposition 3
If k is fixed then the algorithm has O(^{m}n^{k}) time complexity.
Proof
From this and Proposition 2, it follows that the algorithm does at mostO(m^{ 2 }n^{ k }) steps. □
Declarations
Acknowledgements
AJD was partially funded by a Rutherford Discovery Fellowship from the Royal Society of New Zealand.
Authors’ Affiliations
References
 Yang Z, Rannala B: Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method. Mol Biol Evol. 1997, 14 (7): 717724. 10.1093/oxfordjournals.molbev.a025811.View ArticlePubMedGoogle Scholar
 Yang Z, Rannala B: Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol Biol Evol. 2005, 23: 212226. 10.1093/molbev/msj024.View ArticlePubMedGoogle Scholar
 Heled J, Drummond A: Tree priors for relaxed phylogenetics and divergence time estimation. Syst Biol. 2012, 61: 138149. 10.1093/sysbio/syr087.PubMed CentralView ArticlePubMedGoogle Scholar
 Rodrigo AG, Felsenstein J: The Evolution of HIV. 1999, Baltimore: Johns Hopkins Univ PressGoogle Scholar
 Drummond A, Pybus O, Rambaut A, Forsberg R, Rodrigo A: Measurably evolving populations. Trends Ecol Evol. 2003, 18: 481488. 10.1016/S01695347(03)002167.View ArticleGoogle Scholar
 Felsenstein J: Inferring Phylogenies. 2004, Sunderland: Sinauer AssociatesGoogle Scholar
 Murtagh F: Counting dendograms: a survey. Discrete Appl Math. 1984, 7: 191199. 10.1016/0166218X(84)900660.View ArticleGoogle Scholar
 Gordon AD: A review of hierarchical classification. J R Stat Soc A. 1987, 150 (2): 119137. 10.2307/2981629.View ArticleGoogle Scholar
 Semple C, Steel M: Phylogenetics. 2003, New York: Oxford University PressGoogle Scholar
 Griffiths RC: Counting genealogical trees. J Math Biol. 1987, 25: 423431. 10.1007/BF00277166.View ArticlePubMedGoogle Scholar
 Felsenstein J: The number of evolutionary trees. Syst Zool. 1978, 23: 2733.View ArticleGoogle Scholar
 Stadler T: Sampling‐through‐time in birth‐death trees. J Theor Biol. 2010, 267 (3): 396404. 10.1016/j.jtbi.2010.09.010.View ArticlePubMedGoogle Scholar
 Stadler T, Kouyos RD, von Wyl, Yerly S, Böni J, Bürgisser P, Klimkait T, Joos B, Rieder P, Xie D, Günthard HF, Drummond A, Bonhoeffer S, the Swiss HIV Cohort Study: Estimating the basic reproductive number from viral sequence data. Mol Biol Evol. 2012, 29: 347357. 10.1093/molbev/msr217.View ArticlePubMedGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License(http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.