Throughout this section, we limit our attention to one tree *S* from the profile 𝒫. We show how to solve the TBR-RS problem for the instance ⟨(*S*), *T*, *v*⟩ for some non-root node *v* ∈ *V*(*T*) in *O*(*n*) time. Based on this solution, it is straightforward to solve the TBR-RS problem on the instance ⟨𝒫, *T*, *v*⟩ within *O*(*kn*) time as well. For clarity, we will also assume that ℒ(*S*) = ℒ(*T*). In general, if ℒ(*S*) ⊂ ℒ(*T*) then we can simply set *T* to be *T*[ℒ(*S*)]. This takes *O*(*n*) time and, consequently, does not affect the time complexity of our algorithm.

Our algorithm makes use of the LCA mapping from *S* to *T*. This mapping is defined as follows.

**Definition 6** (LCA Mapping). *Given two trees T' and T such that* ℒ(*T'*) ⊆ ℒ(*T*), *the* LCA mapping ℳ_{T', T}: *V*(*T'*) → *V*(*T*) *is the mapping* ℳ_{T', T}(*u*) = *lca*_{T}(ℒ(*T'*_{u})).
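
As an illustration of Definition 6, the following is a minimal sketch that computes the LCA mapping directly from leaf sets. It is not the *O*(*n*)-time algorithm cited later in this section; it runs in quadratic time and is only meant to make the definition concrete. The nested-tuple tree encoding and the names `node_clusters` and `lca_mapping` are assumptions made for this example, and nodes are identified by their leaf sets.

```python
from typing import Dict, FrozenSet, List, Union

# Hypothetical encoding: a leaf is an int, an internal node is a tuple of children.
Tree = Union[int, tuple]


def node_clusters(tree: Tree) -> List[FrozenSet[int]]:
    """Return the leaf set below every node (leaves included), in postorder."""
    out: List[FrozenSet[int]] = []

    def walk(node: Tree) -> FrozenSet[int]:
        if not isinstance(node, tuple):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        out.append(leaves)
        return leaves

    walk(tree)
    return out


def lca_mapping(s: Tree, t: Tree) -> Dict[FrozenSet[int], FrozenSet[int]]:
    """Map each node of s to the LCA in t of the leaves below it.

    Assumes the leaf set of s is contained in the leaf set of t and that
    neither tree has internal nodes with a single child.
    """
    t_nodes = node_clusters(t)
    mapping = {}
    for leaves in node_clusters(s):
        # The LCA is the node of t with the smallest leaf set containing `leaves`.
        mapping[leaves] = min((c for c in t_nodes if leaves <= c), key=len)
    return mapping


S = ((1, 2), (3, 4))
T = (((1, 2), 3), 4)
M = lca_mapping(S, T)
```

In this example the node of `S` with leaf set {3, 4} maps to the root of `T`, because no smaller subtree of `T` contains both leaves.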

**Notation**. We define a boolean function *f*_{T}: *I*(*S*) → {0, 1} such that *f*_{T}(*u*) = 1 if there exists a node *v* ∈ *I*(*T*) such that 𝒞_{S}(*u*) = 𝒞_{T}(*v*), and *f*_{T}(*u*) = 0 otherwise. Thus, *f*_{T}(*u*) = 1 if and only if the cluster 𝒞_{S}(*u*) exists in the tree *T* as well. Additionally, we define ℱ_{T} = {*u* ∈ *I*(*S*): *f*_{T}(*u*) = 0}; that is, ℱ_{T} is the set of all nodes *u* ∈ *I*(*S*) such that the cluster 𝒞_{S}(*u*) does not exist in the tree *T*.

The following lemma associates the value *RF*(*S, T*) with the cardinality of the set ℱ_{T}.

**Lemma 1**. *RF*(*S, T*) = |*I*(*T*)| - |*I*(*S*)| + 2·|ℱ_{T}|.

*Proof*. Let 𝒢_{T} denote the set {*u* ∈ *I*(*S*): *f*_{T}(*u*) = 1}. By the definition of *RF*(*S, T*), we must have *RF*(*S, T*) = |*I*(*T*)| + |*I*(*S*)| - 2·|𝒢_{T}|. Hence, since |𝒢_{T}| + |ℱ_{T}| = |*I*(*S*)|, we get *RF*(*S, T*) = |*I*(*T*)| - |*I*(*S*)| + 2·|ℱ_{T}|. □

**Lemma 2**. *For any u* ∈ *I*(*S*), *f*_{T}(*u*) = 1 *if and only if* |𝒞_{S}(*u*)| = |𝒞_{T}(ℳ_{S, T}(*u*))|.

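*Proof*. By the definition of the LCA mapping, 𝒞_{S}(*u*) ⊆ 𝒞_{T}(ℳ_{S, T}(*u*)) always holds. If |𝒞_{S}(*u*)| = |𝒞_{T}(ℳ_{S, T}(*u*))| then we must have 𝒞_{S}(*u*) = 𝒞_{T}(ℳ_{S, T}(*u*)) and, consequently, *f*_{T}(*u*) = 1. In the other direction, if |𝒞_{S}(*u*)| ≠ |𝒞_{T}(ℳ_{S, T}(*u*))|, then we must have 𝒞_{S}(*u*) ⊂ 𝒞_{T}(ℳ_{S, T}(*u*)). Every node of *T* whose cluster contains 𝒞_{S}(*u*) is either ℳ_{S, T}(*u*) or one of its ancestors, so no node of *T* has cluster 𝒞_{S}(*u*) and, consequently, *f*_{T}(*u*) = 0. □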
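
Lemma 2 replaces a set comparison with a size comparison. The following sketch computes *f*_{T} both ways and lets the two be compared on an example. It uses a naive quadratic LCA computation rather than the linear-time one, and the tuple encoding and function names are assumptions made for this example.

```python
def leaf_sets(tree, include_leaves):
    """Leaf sets below the nodes of a nested-tuple tree, in postorder."""
    out = []

    def walk(node):
        if not isinstance(node, tuple):
            leaves = frozenset([node])
            if include_leaves:
                out.append(leaves)
            return leaves
        leaves = frozenset().union(*(walk(child) for child in node))
        out.append(leaves)
        return leaves

    walk(tree)
    return out


def f_by_membership(s, t):
    """f_T(u) = 1 iff the cluster of u occurs among the clusters of t."""
    t_clusters = set(leaf_sets(t, include_leaves=False))
    return {c: int(c in t_clusters) for c in leaf_sets(s, include_leaves=False)}


def f_by_lemma2(s, t):
    """f_T(u) = 1 iff the cluster of u and the cluster of its LCA image have equal size."""
    t_nodes = leaf_sets(t, include_leaves=True)
    result = {}
    for c in leaf_sets(s, include_leaves=False):
        # LCA image: the node of t with the smallest leaf set containing c.
        image = min((d for d in t_nodes if c <= d), key=len)
        result[c] = int(len(c) == len(image))
    return result


S = ((1, 2), (3, 4))
T = (((1, 2), 3), 4)
```

With these inputs, the node of `S` with cluster {3, 4} maps to the root of `T`, whose cluster has size 4, so the size test reports 0 for it, in agreement with the membership test.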

The LCA mapping from *S* to *T* can be computed in *O*(*n*) time [39], and consequently, by Lemmas 1 and 2, we can compute the RF distance between *S* and *T* in *O*(*n*) time as well (other *O*(*n*)-time algorithms for calculating the RF distance are presented in [30, 31]). Moreover, Lemma 1 implies that in order to find a tree *T** ∈ TBR_{T}(*v*) such that *RF*(*S*, *T**) = min_{T' ∈ TBR_{T}(v)} *RF*(*S*, *T'*), it is sufficient to find a tree *T** ∈ TBR_{T}(*v*) for which |ℱ_{T*}| = min_{T' ∈ TBR_{T}(v)} |ℱ_{T'}|.

*Remark*: An implicit assumption here is that the leaves of both trees are labeled by integers {1, ..., *n*}. If the leaf labels are arbitrary, then we require an additional *O*(*kn* log *n*)-time preprocessing step to relabel the leaves of the trees in the given profile. Note, however, that this additional step does not add to the overall time complexity of solving the TBR-S or SPR-S problems.
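
The relabeling step in the remark can be sketched as follows. This is only one possible way to do it, assuming string leaf labels and the nested-tuple tree encoding used for illustration here; the function name `relabel_profile` is hypothetical. The sketch sorts the distinct labels once and then uses a hash map for lookups, so its cost is dominated by the sort plus one pass over each tree.

```python
def relabel_profile(profile):
    """Replace arbitrary leaf labels by integers 1..n, consistently across all trees."""
    labels = set()

    def collect(node):
        if isinstance(node, tuple):
            for child in node:
                collect(child)
        else:
            labels.add(node)

    for tree in profile:
        collect(tree)

    # Sorting fixes a deterministic order; any fixed bijection would do.
    index = {label: i for i, label in enumerate(sorted(labels), start=1)}

    def rename(node):
        if isinstance(node, tuple):
            return tuple(rename(child) for child in node)
        return index[node]

    return [rename(tree) for tree in profile], index


profile = [(("cat", "dog"), "emu"), ("dog", ("ant", "cat"))]
relabeled, index = relabel_profile(profile)
```

Here the four labels are numbered in alphabetical order, and both trees are rewritten with the same numbering.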

We now show that the TBR-RS problem can be solved by solving two smaller problems separately and combining their solutions.

As before, we limit our attention to one tree *S* from the profile 𝒫. Given the TBR-RS instance ⟨(*S*), *T*, *v*⟩, we define a bipartition {*X*, *X̄*} of *I*(*S*), where *X* = {*u* ∈ *I*(*S*): ℳ_{S, T}(*u*) ∈ *V*(*T*_{v})} and *X̄* = *I*(*S*) \ *X*.

**Lemma 3**. *If u* ∈ *X, then f*_{T'}(*u*) = *f*_{T}(*u*) *for all T'* ∈ TBR_{T}(*v, v*). *If u* ∈ *X̄ and y denotes the sibling of v, then f*_{T'}(*u*) = *f*_{T}(*u*), *where T'* = TBR_{T}(*v*, *x*, *y*) *for any x* ∈ *V*(*T*_{v}).

*Proof*. Consider the case when *u* ∈ *X*. Let *T'* be any tree in TBR_{T}(*v*, *v*) and let node *y* ∈ *V*(*T*) be such that *T'* = TBR_{T}(*v*, *v*, *y*). Thus, for any node *w* ∈ *V*(*T*_{v}), the subtrees *T*_{w} and *T'*_{w} must be identical. Since *u* ∈ *X*, we must have ℳ_{S, T}(*u*) ∈ *V*(*T*_{v}) and, consequently, 𝒞_{T'}(ℳ_{S, T'}(*u*)) = 𝒞_{T}(ℳ_{S, T}(*u*)). Lemma 2 now implies that *f*_{T'}(*u*) = *f*_{T}(*u*).

Now consider the case when *u* ∈ *X̄*. Let *y* denote the sibling of *v* in tree *T* and let *T'* = TBR_{T}(*v*, *x*, *y*), for some *x* ∈ *V*(*T*_{v}). Thus, for any node *w* ∈ *V*(*T*) \ *V*(*T*_{v}), we must have ℒ_{T}(*w*) = ℒ_{T'}(*w*). Moreover, the leaf sets of the two subtrees rooted at the children of *w* in *T* must be identical to the leaf sets of the two subtrees rooted at the children of *w* in *T'*. This implies that if ℳ_{S, T}(*u*) = *w*, then ℳ_{S, T'}(*u*) = *w* as well. By Lemma 2 we must therefore have *f*_{T'}(*u*) = *f*_{T}(*u*). □

Lemma 3 implies that a tree in TBR_{T}(*v*) with smallest RF distance can be obtained by optimizing the rooting for the pruned subtree, and optimizing the regraft location separately. This allows us to obtain a tree in TBR_{T}(*v*) with smallest RF distance by evaluating only *O*(*n*) trees. Contrast this with the naïve approach to finding a tree in TBR_{T}(*v*) with smallest total distance, which is to evaluate all trees obtained by rerooting the pruned subtree in all possible ways, and, for each rerooting, regrafting the subtree in all possible locations. Since there are *O*(*n*) ways to reroot the pruned subtree, and *O*(*n*) ways to regraft, this would require evaluating *O*(*n*^{2}) trees. It is interesting to note that this ability to decompose the TBR-RS problem into two simpler problems is not unique to the context of RF supertrees alone. For example, it has been observed that a similar decomposition can be achieved in the context of the gene duplication problem [37].
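
One way to read this decomposition, written out as a sketch: the count of missing clusters splits into a part that depends only on the rooting *x* and a part that depends only on the regraft location *y*. The names *a*(*x*), *b*(*y*) and *y*_0 (the sibling of *v* in *T*) are introduced here for illustration only, and X̄ stands for *I*(*S*) \ *X*.

```latex
% a(x), b(y) and y_0 are illustrative names, not notation from the paper.
\begin{align*}
\bigl|\mathcal{F}_{\mathrm{TBR}_T(v,x,y)}\bigr|
  &= \underbrace{\bigl|\{u \in X : f_{\mathrm{TBR}_T(v,x,y_0)}(u) = 0\}\bigr|}_{a(x)}
   + \underbrace{\bigl|\{u \in \bar{X} : f_{\mathrm{TBR}_T(v,v,y)}(u) = 0\}\bigr|}_{b(y)}, \\
\min_{x,\,y} \bigl|\mathcal{F}_{\mathrm{TBR}_T(v,x,y)}\bigr|
  &= \min_{x} a(x) + \min_{y} b(y).
\end{align*}
```

Under this reading, the *O*(*n*) candidate rootings and the *O*(*n*) candidate regraft locations are scanned independently, which gives *O*(*n*) + *O*(*n*) = *O*(*n*) evaluated trees instead of *O*(*n*) · *O*(*n*) = *O*(*n*^{2}).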

Thus, to solve the TBR-RS problem, we must find (i) a rerooting *T'* of the subtree *T*_{v} for which |ℱ_{T'}| is minimized, and (ii) a regraft location *y* for *T*_{v} which minimizes |ℱ_{SPR_{T}(v, y)}|. Observe that the problem in part (ii) is simply the SPR-RS problem on the input instance ⟨(*S*), *T*, *v*⟩. For part (i), consider the following problem statement.

**Problem 4** (Rooting). *Given instance* ⟨𝒫, *T*, *v*⟩, *where* 𝒫 *is the profile* (*T*_{1}, ..., *T*_{k}), *T is a supertree on* 𝒫, *and v is a non-root node in V*(*T*), *find a node x* ∈ *V*(*T*_{v}) *for which RF*(𝒫, TBR_{T}(*v, x, y*)) *is minimum, where y denotes the sibling of v in T*.

Note that the problem in part (i) is the Rooting problem on the input instance ⟨(*S*), *T*, *v*⟩. We show how to solve both the Rooting and the SPR-RS problems in *O*(*n*) time on instance ⟨(*S*), *T*, *v*⟩. As seen above, based on Lemma 3, this immediately implies that the TBR-RS problem for a profile consisting of a single tree can be solved in *O*(*n*) time. To solve the TBR-RS problem on instance ⟨𝒫, *T*, *v*⟩, we simply solve the Rooting and SPR-RS problems separately on the input instance ⟨𝒫, *T*, *v*⟩, which takes *O*(*kn*) time (see Theorems 3 and 4). We thus have the following two theorems.

**Theorem 1**. *The* TBR-*RS problem can be solved in O*(*kn*) *time*.

**Theorem 2**. *The* TBR-*S problem can be solved in O*(*kn*^{2}) *time*.
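
The step from Theorem 1 to Theorem 2 can be sketched as a simple count, assuming (as the problem names suggest, though the definition is not restated in this section) that the TBR-S problem searches the union of TBR_{T}(*v*) over all non-root nodes *v* of *T*. Since *T* has *O*(*n*) nodes, the TBR-RS problem is solved once per choice of *v*:

```latex
% Assumes TBR-S ranges over all non-root nodes v of T.
\[
\underbrace{O(n)}_{\text{choices of } v}
\;\times\;
\underbrace{O(kn)}_{\text{TBR-RS on } \langle \mathcal{P},\, T,\, v \rangle}
\;=\; O(kn^{2}).
\]
```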