Characterizing compatibility and agreement of unrooted trees via cuts in graphs
Algorithms for Molecular Biology volume 9, Article number: 13 (2014)
Abstract
Background
Deciding whether there is a single tree —a supertree— that summarizes the evolutionary information in a collection of unrooted trees is a fundamental problem in phylogenetics. We consider two versions of this question: agreement and compatibility. In the first, the supertree is required to reflect precisely the relationships among the species exhibited by the input trees. In the second, the supertree can be more refined than the input trees.
Testing for compatibility is an NP-complete problem; however, the problem is solvable in polynomial time when the number of input trees is fixed. Testing for agreement is also NP-complete, but it is not known whether it is fixed-parameter tractable. Compatibility can be characterized in terms of the existence of a specific kind of triangulation in a structure known as the display graph. Alternatively, it can be characterized as a chordal graph sandwich problem in a structure known as the edge label intersection graph. No characterization of agreement was known.
Results
We present a simple and natural characterization of compatibility in terms of minimal cuts in the display graph, which is closely related to compatibility of splits. We then derive a characterization for agreement.
Conclusions
Explicit characterizations of tree compatibility and agreement are essential to finding practical algorithms for these problems. The simplicity of the characterizations presented here could help to achieve this goal.
Background
A phylogenetic tree T is an unrooted tree whose leaves are bijectively mapped to a label set L(T). Labels represent species and T represents the evolutionary history of these species. Let 𝒫 = {T_1, T_2, …, T_k} be a collection of phylogenetic trees. We call 𝒫 a profile, refer to the trees in 𝒫 as input trees, and denote the combined label set of the input trees, ⋃_{i∈[k]} L(T_i), by L(𝒫). A supertree of 𝒫 is a phylogenetic tree whose label set is L(𝒫). The goal of constructing a supertree for a profile is to synthesize the information in the input trees in a larger, more comprehensive, phylogeny [1]. Ideally, a supertree should faithfully reflect the relationships among the species implied by the input trees. In reality, this is rarely achievable, because of conflicts among the input trees due to errors in constructing them or to biological processes such as lateral gene transfer and gene duplication.
We consider two classic versions of the supertree problem, based on the closely related notions of compatibility and agreement. Let S and T be two phylogenetic trees where L(T)⊆L(S) (for our purposes, T would be an input tree and S a supertree). Let S′ be the tree obtained by suppressing any degree-two vertices in the minimal subtree of S connecting the labels in L(T). We say that S displays T, or that T and S are compatible, if T can be derived from S′ by contracting edges. We say that tree T is an induced subtree of S, or that T and S agree, if S′ is isomorphic to T.
Let 𝒫 be a profile. The tree compatibility problem asks if there exists a supertree for 𝒫 that displays all the trees in 𝒫. If such a supertree S exists, we say that 𝒫 is compatible and S is a compatible supertree for 𝒫. The agreement supertree problem asks if there exists a supertree for 𝒫 that agrees with all the trees in 𝒫. If such a supertree S exists, we say that S is an agreement supertree (AST) for 𝒫.
Compatibility and agreement embody different philosophies about conflict. An agreement supertree must reflect precisely the evolutionary relationships exhibited by the input trees. In contrast, a compatible supertree is allowed to exhibit more fine-grained relationships among certain labels than those exhibited by an input tree. From a biological viewpoint, the differences between compatibility and agreement reflect different ways to treat polytomies —i.e., nodes of degree greater than three. Compatibility treats polytomies as soft facts: if an input tree node has degree four or more, it is not because there were multiple simultaneous speciation events, but because there is not enough information to resolve the sequence of speciation. Thus, if another input tree provides more refined information about speciation order, we can use it, provided the information is not contradicted by the remaining input trees. Agreement, in contrast, treats polytomies as hard facts. Note that compatibility and agreement are equivalent when the input trees are binary.
If all the input trees share a common label (which can be viewed as a root node), both tree compatibility and agreement are solvable in polynomial time [2, 3]. In general, however, the two problems are NP-complete, and remain so even when the trees are quartets; i.e., binary trees with exactly four leaves [4]. Nevertheless, Bryant and Lagergren showed that the tree compatibility problem is fixed-parameter tractable when parametrized by the number of trees [5]. It is unknown whether or not the agreement supertree problem has the same property.
To prove the fixed-parameter tractability of tree compatibility, Bryant and Lagergren first showed that a necessary (but not sufficient) condition for a profile to be compatible is that the tree-width of a certain graph —the display graph of the profile (see Section ‘Display graphs and edge label intersection graphs’)— be bounded by the number of trees. They then showed how to express compatibility as a bounded-size monadic second-order formula on the display graph. By Courcelle’s Theorem [6, 7], these two facts imply that compatibility can be decided in time linear in the size of the display graph. Unfortunately, Bryant and Lagergren’s argument amounts essentially to only an existential proof, as it is not clear how to obtain an explicit algorithm for unrooted compatibility from it.
A necessary step towards finding a practical algorithm for compatibility (and indeed for agreement) is to develop an explicit characterization of the problem. In earlier work [8], we made some progress in this direction, characterizing tree compatibility in terms of the existence of a legal triangulation of the display graph of the profile. Gysel et al. [9] provided an alternative characterization, based on a structure they call the edge label intersection graph (ELIG) (see Section ‘Display graphs and edge label intersection graphs’). Their formulation is in some ways simpler than that of [8], allowing Gysel et al. to express tree compatibility as a chordal sandwich problem. Neither [8] nor [9] deals with agreement.
Here, we show that the connection between separators in the ELIG and cuts in the display graph (explored in Section ‘Display graphs and edge label intersection graphs’) leads to a new, and natural, characterization of compatibility in terms of minimal cuts in the display graph (Section ‘Characterizing compatibility via cuts’). We then show how such cuts are closely related to the splits of the compatible supertree (Section ‘Splits and cuts’). Next, we give a characterization of agreement in terms of minimal cuts of the display graph (Section ‘Characterizing agreement via cuts’). To our knowledge, there was no previous characterization of the agreement supertree problem for unrooted trees. Lastly, we examine the connection between the triangulation-based and the cut-based perspectives on compatibility (Section ‘Relationship to legal triangulations’).
Preliminaries
Splits, compatibility, and agreement
A split of a label set L is a bipartition of L consisting of non-empty sets. We denote a split {X,Y} by X|Y. A split is non-trivial if neither of its sets is a singleton; otherwise, it is trivial. Let T be a phylogenetic tree. Let e be an edge of T. Deletion of e disconnects T into two subtrees T1 and T2. If L1 and L2 denote the set of all labels in T1 and T2, respectively, then L1|L2 is a split of L(T). We denote by σ_e(T) the split corresponding to edge e of T; if e is a leaf edge, then σ_e(T) is a trivial split. Let Σ(T) denote the set of all splits corresponding to internal edges of T and Σ_triv(T) denote the set of all (trivial) splits corresponding to leaf edges of T.
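As a minimal sketch of the edge-to-split correspondence just described, the following Python function deletes each edge of a small tree and records the labels on either side. The edge-list representation, the name `edge_splits`, and the example tree are assumptions made for illustration only.

```python
from collections import defaultdict

def edge_splits(edges, labels):
    """Map each edge of a tree to the split of `labels` obtained by deleting it.

    `edges` is a list of (u, v) pairs; `labels` is the set of leaf labels.
    """
    labels = frozenset(labels)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    splits = {}
    for u, v in edges:
        # Collect every vertex reachable from u without crossing edge {u, v}.
        seen, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen and {x, y} != {u, v}:
                    seen.add(y)
                    stack.append(y)
        side = labels & seen
        splits[frozenset((u, v))] = frozenset((side, labels - side))
    return splits

# Hypothetical tree: leaves a, b, c, d; internal vertices 1 and 2.
tree = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
splits = edge_splits(tree, "abcd")
internal = splits[frozenset((1, 2))]   # the non-trivial split ab|cd
leaf = splits[frozenset(("a", 1))]     # the trivial split a|bcd
```

In this toy tree the only internal edge is {1,2}, so Σ(T) contains the single split ab|cd, while the four leaf edges give the trivial splits.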
A tree T displays a split X|Y if there exists an internal edge e of T where σ_e(T)=X|Y. A set of splits of a label set L is compatible if there exists a tree that displays all the splits in the set. It is well-known that two splits A1|A2 and B1|B2 are compatible if and only if at least one of A1∩B1, A1∩B2, A2∩B1 and A2∩B2 is empty [10]. Note that a trivial split of L is compatible with every split of L.
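The four-intersection test for a pair of splits translates directly into code. The sketch below is illustrative: the helper names and the tuple-of-frozensets representation are assumptions, and the first data set reuses the four splits listed in Example 2.

```python
from itertools import combinations

def split(left, right):
    """Build a split as a pair of frozensets (hypothetical helper)."""
    return (frozenset(left), frozenset(right))

def splits_compatible(s, t):
    """Two splits are compatible iff one of the four pairwise intersections is empty."""
    (a1, a2), (b1, b2) = s, t
    return any(not (x & y) for x in (a1, a2) for y in (b1, b2))

def pairwise_compatible(splits):
    """Check every unordered pair of splits."""
    return all(splits_compatible(s, t) for s, t in combinations(splits, 2))

# The four splits of Example 2 over the labels a..g.
example2 = [
    split("abc", "defg"),
    split("abcfg", "de"),
    split("ab", "cdefg"),
    split("abcde", "fg"),
]
ok = pairwise_compatible(example2)

# A classic conflicting pair: ab|cd versus ac|bd.
clash = splits_compatible(split("ab", "cd"), split("ac", "bd"))
```

By Theorem 1, pairwise compatibility is all that needs checking for a set of splits to be realized by a single tree.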
Theorem 1 (Splits-Equivalence Theorem [10, 11]).
Let Σ be a collection of splits of a label set X that includes all trivial splits. Then, Σ=Σ(T)∪Σ_triv(T) for some phylogenetic tree T with label set X if and only if the splits in Σ are pairwise compatible. Tree T is unique up to isomorphism.
Let S be a phylogenetic tree and let Y be a subset of L(S). Then, S|Y denotes the tree obtained by suppressing any degree-two vertices in the minimal subtree of S connecting the labels in Y. Now, let T be a phylogenetic tree such that L(T)⊆L(S). Then, S displays T if and only if Σ(T)⊆Σ(S|L(T)); T and S agree if and only if Σ(T)=Σ(S|L(T)).
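The split-based tests for "displays" and "agrees" can be sketched as follows. This is an illustration under stated assumptions: trees are edge lists, the non-trivial splits of S|Y are obtained by restricting each split of S to Y and discarding restrictions with a side of fewer than two labels, and all function names and example trees are hypothetical.

```python
from collections import defaultdict

def tree_splits(edges, labels):
    """Set of splits of `labels` induced by the edges of a tree."""
    labels = frozenset(labels)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    out = set()
    for u, v in edges:
        seen, stack = {u}, [u]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen and {x, y} != {u, v}:
                    seen.add(y)
                    stack.append(y)
        side = labels & seen
        out.add(frozenset((side, labels - side)))
    return out

def restricted_nontrivial(splits, y):
    """Non-trivial splits of `y` obtained by restricting each split to `y`."""
    y = frozenset(y)
    out = set()
    for s in splits:
        a, b = (part & y for part in s)
        if len(a) >= 2 and len(b) >= 2:
            out.add(frozenset((a, b)))
    return out

def displays(s_edges, s_labels, t_edges, t_labels):
    sigma_t = restricted_nontrivial(tree_splits(t_edges, t_labels), t_labels)
    sigma_s = restricted_nontrivial(tree_splits(s_edges, s_labels), t_labels)
    return sigma_t <= sigma_s

def agrees(s_edges, s_labels, t_edges, t_labels):
    sigma_t = restricted_nontrivial(tree_splits(t_edges, t_labels), t_labels)
    sigma_s = restricted_nontrivial(tree_splits(s_edges, s_labels), t_labels)
    return sigma_t == sigma_s

# Hypothetical supertree on a..e with non-trivial splits ab|cde and abc|de.
S = [("a", 1), ("b", 1), (1, 2), ("c", 2), (2, 3), ("d", 3), ("e", 3)]
star = [("a", 9), ("b", 9), ("c", 9), ("d", 9)]              # unresolved
quartet = [("a", 7), ("b", 7), (7, 8), ("c", 8), ("d", 8)]   # ab|cd
conflict = [("a", 7), ("c", 7), (7, 8), ("b", 8), ("d", 8)]  # ac|bd
```

The star tree illustrates the difference between the two notions: S displays it (S is merely more refined) but does not agree with it, because S restricted to {a,b,c,d} contains the split ab|cd, which the star lacks.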
Cliques, separators, cuts, and triangulations
Let G be a graph. We represent the vertices and edges of G by V(G) and E(G) respectively. A clique of G is a complete subgraph of G. A clique H of G is maximal if there is no other clique H′ of G where V(H)⊂V(H′). For any U⊆V(G), G−U is the graph derived by removing vertices of U and their incident edges from G. For any F⊆E(G), G−F is the graph with vertex set V(G) and edge set E(G) ∖ F.
For any two nonadjacent vertices a and b of G, an a-b separator of G is a set U of vertices where U⊂V(G) and a and b are in different connected components of G−U. An a-b separator U is minimal if for every U′⊂U, U′ is not an a-b separator. A set U⊆V(G) is a minimal separator if U is a minimal a-b separator for some nonadjacent vertices a and b of G. We represent the set of all minimal separators of graph G by △_G. Two minimal separators U and U′ are parallel if G−U contains at most one component H where V(H)∩U′≠∅.
A connected component H of G−U is full if for every u∈U there exists some vertex v∈H where {u,v}∈E(G).
Lemma 1 ([12]).
For a graph G and any U⊂V(G), U is a minimal separator of G if and only if G−U has at least two full components.
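Lemma 1 gives a direct computational test for minimal separators: count the full components of G−U. The following sketch assumes the graph is stored as a dictionary mapping each vertex to its neighbor set; the function names and the path example are illustrative assumptions.

```python
def components(adj, removed):
    """Connected components of the graph after deleting the vertices in `removed`."""
    seen, comps = set(removed), []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

def full_components(adj, u_set):
    """Components of G-U in which every vertex of U has at least one neighbor."""
    return [c for c in components(adj, u_set) if all(adj[u] & c for u in u_set)]

def is_minimal_separator(adj, u_set):
    """Lemma 1: U is a minimal separator iff G-U has at least two full components."""
    return len(full_components(adj, u_set)) >= 2

# Hypothetical path a - u - w - b.
path = {"a": {"u"}, "u": {"a", "w"}, "w": {"u", "b"}, "b": {"w"}}
single = is_minimal_separator(path, {"u"})        # {u} separates a from w and b
double = is_minimal_separator(path, {"u", "w"})   # separates a from b, but not minimally
```

In the path example, {u,w} does separate a from b, yet neither resulting component is adjacent to both u and w, so the test correctly rejects it as non-minimal.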
A chord is an edge between two nonadjacent vertices of a cycle. A graph H is chordal if and only if every cycle of length four or greater in H has a chord. A chordal graph H is a triangulation of graph G if V(G)=V(H) and E(G)⊆E(H). The edges in E(H) ∖ E(G) are called fill-in edges of G. A triangulation is minimal if removing any fill-in edge yields a non-chordal graph.
A clique tree of a chordal graph H is a pair (T,B) where (i) T is a tree, (ii) B is a bijective function from vertices of T to maximal cliques of H, and (iii) for every vertex v∈H, the set of all vertices x of T where v∈B(x) induces a subtree in T. Property (iii) is called coherence.
Let ℱ be a collection of subsets of V(G). We represent by G_ℱ the graph derived from G by making the set of vertices of X a clique for every X∈ℱ. The next result summarizes basic facts about separators and triangulations (see [12–14]).
Theorem 2.
Let ℱ be a maximal set of pairwise parallel minimal separators of G and H be a minimal triangulation of G. Then, the following statements hold.
1. G_ℱ is a minimal triangulation of G.
2. Let (T,B) be a clique tree of G_ℱ. There exists a minimal separator F∈ℱ if and only if there exist two adjacent vertices x and y in T where B(x)∩B(y)=F.
3. △_H is a maximal set of pairwise parallel minimal separators of G and G_{△_H}=H.
A cut in a connected graph G is a subset F of edges of G such that G−F is disconnected. A cut F is minimal if there does not exist F′⊂F where G−F′ is disconnected. Note that if F is minimal, G−F has exactly two connected components. Two minimal cuts F and F′ are parallel if G−F has at most one connected component H where E(H)∩F′≠∅.
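The definitions of minimal and parallel cuts can be checked mechanically. The sketch below relies on a standard equivalent test that is an assumption of this example rather than a statement from the text: in a connected graph, F is a minimal cut exactly when G−F has two components and every edge of F joins them. Edges are stored as two-element frozensets, and all names are illustrative.

```python
def component_of(vertices, edges):
    """Label every vertex with a representative of its connected component."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)
    comp = {}
    for s in vertices:
        if s in comp:
            continue
        comp[s] = s
        stack = [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in comp:
                    comp[y] = s
                    stack.append(y)
    return comp

def is_minimal_cut(vertices, edges, cut):
    """True iff G-cut has two components and every cut edge crosses between them."""
    comp = component_of(vertices, edges - cut)
    two_parts = len(set(comp.values())) == 2
    return two_parts and all(len({comp[x] for x in e}) == 2 for e in cut)

def are_parallel(vertices, edges, f1, f2):
    """True iff at most one component of G-f1 contains an edge of f2."""
    comp = component_of(vertices, edges - f1)
    touched = {comp[next(iter(e))] for e in f2 - f1}
    return len(touched) <= 1

# Hypothetical 4-cycle 1-2-3-4-1.
V = {1, 2, 3, 4}
E = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 1)]}
A = {frozenset((1, 2)), frozenset((3, 4))}
B = {frozenset((2, 3)), frozenset((4, 1))}
C = {frozenset((1, 2)), frozenset((2, 3))}
```

On the 4-cycle, A and B are both minimal cuts but are not parallel, since the two edges of B lie in different components of G−A; A and C are parallel.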
Display graphs and edge label intersection graphs
We now introduce the two main notions that we use to characterize compatibility and agreement: the display graph and the edge label intersection graph. We then present some known results about these graphs, along with new results on the relationships between them. Here and in the rest of the paper, [m] denotes the set {1,…,m}, where m is a positive integer. Since for any phylogenetic tree T there is a bijection between the leaves of T and L(T), we refer to the leaves of T by their labels.
Let 𝒫 = {T_1, T_2, …, T_k} be a profile. We assume that for any i,j∈[k] such that i≠j, the sets of internal vertices of input trees T_i and T_j are disjoint. The display graph of 𝒫, denoted by G(𝒫), is a graph whose vertex set is ⋃_{i∈[k]} V(T_i) and edge set is ⋃_{i∈[k]} E(T_i) (see Figure 1). A vertex v of G(𝒫) is a leaf if v∈L(𝒫). Every other vertex of G(𝒫) is an internal vertex. An edge of G(𝒫) is internal if its endpoints are both internal.
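A display graph is simply the union of the input trees, glued at shared leaf labels. The following sketch builds it for a hypothetical two-tree profile (not the profile of Figure 1); the edge-list representation and function names are assumptions, and leaves are recognized as degree-one vertices of each tree.

```python
def leaves(tree):
    """Leaves of a tree given as an edge list: the vertices of degree one."""
    degree = {}
    for u, v in tree:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {x for x, d in degree.items() if d == 1}

def display_graph(profile):
    """Union of the vertex and edge sets of all input trees.

    Assumes internal vertices of different trees carry different names,
    so the trees overlap only at shared leaf labels.
    """
    vertices, edges, labels = set(), set(), set()
    for tree in profile:
        labels |= leaves(tree)
        for u, v in tree:
            vertices.update((u, v))
            edges.add(frozenset((u, v)))
    # An edge is internal when neither endpoint is a leaf.
    internal_edges = {e for e in edges if not (e & labels)}
    return vertices, edges, labels, internal_edges

# Hypothetical profile: two quartet trees sharing the labels c and d.
t1 = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
t2 = [("c", 3), ("d", 3), (3, 4), ("e", 4), ("f", 4)]
vertices, edges, labels, internal_edges = display_graph([t1, t2])
```

Here the display graph has ten vertices and ten edges, and its only internal edges are {1,2} and {3,4}.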
A triangulation G′ of G(𝒫) is legal if it satisfies the following conditions.
1. For every clique C of G′, if C contains an internal edge, then it contains no other edge of G(𝒫).
2. No fill-in edge in G′ has a leaf as an endpoint.
Theorem 3 (Vakati, Fernández-Baca [8]).
A profile 𝒫 of unrooted phylogenetic trees is compatible if and only if G(𝒫) has a legal triangulation.
In what follows, we assume that G(𝒫) is connected. If it is not, the connected components of G(𝒫) induce a partition of 𝒫 into sub-profiles such that for each sub-profile 𝒫′, G(𝒫′) is a connected component of G(𝒫). It is easy to see that 𝒫 is compatible if and only if each sub-profile is compatible.
The edge label intersection graph of 𝒫, denoted LG(𝒫), is the line graph of G(𝒫) [9]. That is, the vertex set of LG(𝒫) is E(G(𝒫)) and two vertices of LG(𝒫) are adjacent if the corresponding edges in G(𝒫) share an endpoint. (We should note that Gysel et al. [9] refer to LG(𝒫) as the modified edge label intersection graph.) For an unrooted tree T, LG(T) denotes LG({T}).
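Since the edge label intersection graph is the line graph of the display graph, it can be built with a simple pairwise scan. The sketch below is a quadratic-time illustration on a hypothetical two-quartet profile; `line_graph` and the example trees are assumptions of this example.

```python
from itertools import combinations

def line_graph(edges):
    """Adjacency map of the line graph: two edges are adjacent if they share an endpoint."""
    nodes = [frozenset(e) for e in edges]
    adj = {e: set() for e in nodes}
    for e, f in combinations(nodes, 2):
        if e & f:
            adj[e].add(f)
            adj[f].add(e)
    return adj

# Hypothetical profile: two quartet trees sharing the labels c and d.
t1 = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
t2 = [("c", 3), ("d", 3), (3, 4), ("e", 4), ("f", 4)]
lg = line_graph(t1 + t2)

n_vertices = len(lg)
n_edges = sum(len(nbrs) for nbrs in lg.values()) // 2
shared_leaf = frozenset(("c", 3)) in lg[frozenset(("c", 2))]
```

Each internal vertex of degree three contributes a triangle to the line graph, and each shared leaf (c and d here) contributes one edge joining vertices that come from different input trees.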
Observation 1.
Let F be a set of edges of G(𝒫) and let v1,v2,…,v_m be vertices of G(𝒫), where m≥2. Then, v1,v2,…,v_m is a path in G(𝒫)−F if and only if {v1,v2},…,{v_{m−1},v_m} is a path in LG(𝒫)−F.
Thus, if G(𝒫) is connected, so is LG(𝒫). Hence, in what follows, we assume that LG(𝒫) is connected.
A fill-in edge for LG(𝒫) is valid if for every T∈𝒫, at least one of the endpoints of the edge is not in LG(T). A triangulation H of LG(𝒫) is restricted if every fill-in edge of H is valid.
Theorem 4 (Gysel et al. [9]).
A profile 𝒫 of unrooted phylogenetic trees is compatible if and only if LG(𝒫) has a restricted triangulation.
A minimal separator F of LG(𝒫) is legal if for every T∈𝒫, all the edges of T in F share a common endpoint; i.e., F∩E(T) is a clique in LG(T). The following theorem was mentioned in [9]. For future reference, we formally state it and prove it here.
Theorem 5.
A profile 𝒫 is compatible if and only if there exists a maximal set ℱ of pairwise parallel minimal separators in LG(𝒫) where every separator in ℱ is legal.
Proof.
Our approach is similar to the one used by Gusfield in [15]. Assume that 𝒫 is compatible. From Theorem 4, there exists a restricted triangulation H of LG(𝒫). We can assume that H is minimal (if it is not, simply delete fill-in edges repeatedly from H until it is minimal). Let ℱ=△_H. From Theorem 2, ℱ is a maximal set of pairwise parallel minimal separators of LG(𝒫) and LG(𝒫)_ℱ=H. Suppose ℱ contains a separator F that is not legal. Let {e,e′}⊆F where {e,e′}⊆E(T) for some input tree T and e∩e′=∅. The vertices of F form a clique in H. Thus, H contains the edge {e,e′}. Since {e,e′} is not a valid edge, H is not a restricted triangulation, a contradiction. Hence, every separator in ℱ is legal.
Conversely, let ℱ be a maximal set of pairwise parallel minimal separators of LG(𝒫) where every separator in ℱ is legal. From Theorem 2, LG(𝒫)_ℱ is a minimal triangulation of LG(𝒫). If {e,e′} is a fill-in edge of LG(𝒫)_ℱ, then e∩e′=∅ and there exists a minimal separator F∈ℱ where {e,e′}⊆F. Since F is legal, if {e,e′}⊆E(T) for some input tree T then e∩e′≠∅. Thus, e and e′ are not both from LG(T) for any input tree T. Hence, every fill-in edge in LG(𝒫)_ℱ is valid, and LG(𝒫)_ℱ is a restricted triangulation. By Theorem 4, 𝒫 is compatible.
Let u be a vertex of some input tree. We write Inc(u) to denote the set of all edges of G(𝒫) incident on u. Equivalently, Inc(u) is the set of all vertices e of LG(𝒫) such that u∈e.
Let F be a cut of the display graph G(𝒫). F is legal if for every tree T∈𝒫, the edges of T in F are incident on a common vertex; i.e., if F∩E(T)⊆Inc(u) for some u∈V(T). F is nice if F is legal and each connected component of G(𝒫)−F has at least one edge.
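Both conditions on a cut are easy to test. In the sketch below, a cut is a set of two-element frozensets, legality is checked by intersecting the cut edges of each tree, and niceness uses the observation that a component has no edge exactly when it is an isolated vertex. The functions assume the given edge set really is a cut of the display graph; names and the profile are hypothetical.

```python
def is_legal(profile, cut):
    """For every tree, the edges of that tree in the cut share a common vertex."""
    for tree in profile:
        in_cut = [frozenset(e) for e in tree if frozenset(e) in cut]
        if in_cut and not frozenset.intersection(*in_cut):
            return False
    return True

def is_nice(profile, cut):
    """Legal, and no vertex of the display graph loses all of its incident edges."""
    if not is_legal(profile, cut):
        return False
    all_edges = {frozenset(e) for tree in profile for e in tree}
    vertices = set().union(*all_edges)
    still_covered = set().union(*(all_edges - cut))
    return still_covered == vertices

# Hypothetical profile: two quartet trees sharing the labels c and d.
t1 = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
t2 = [("c", 3), ("d", 3), (3, 4), ("e", 4), ("f", 4)]
profile = [t1, t2]

internal_cut = {frozenset((1, 2))}                       # legal and nice
leaf_cut = {frozenset(("a", 1))}                         # legal, but isolates leaf a
scattered = {frozenset(("a", 1)), frozenset(("c", 2))}   # two non-incident edges of t1
```

Cutting a single leaf edge is always legal but never nice, which is why nice cuts correspond to splits with labels on both sides.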
Lemma 2.
Let F be a subset of E(G(𝒫)). Then, F is a legal minimal separator of LG(𝒫) if and only if F is a nice minimal cut of G(𝒫).
To prove Lemma 2, we need two auxiliary lemmas and a corollary.
Lemma 3.
Let F be any minimal separator of LG(𝒫) and u be any vertex of any input tree. Then, Inc(u) ⊈ F.
Proof.
Suppose F is a minimal a-b separator of LG(𝒫) and u is a vertex of some input tree such that Inc(u)⊆F. Consider any vertex e∈Inc(u). Then, there exists a path π from a to b in LG(𝒫) where e is the only vertex of F in π. If such a path π did not exist, then F∖{e} would still be an a-b separator, and F would not be minimal, a contradiction. Let e1 and e2 be the neighbors of e in π and let e={u,v}. Since Inc(u)⊆F, π does not contain any other vertex e′ where u∈e′. Thus, e∩e1={v} and e∩e2={v}. Let π=a,…,e1,e,e2,…,b. Since e1 and e2 share the endpoint v, they are adjacent in LG(𝒫), so π′=a,…,e1,e2,…,b is also a path from a to b. But π′ does not contain any vertex of F, contradicting the assumption that F is a separator of LG(𝒫). Hence, neither such a minimal separator F nor such a vertex u exists.
Lemma 4.
If F is a minimal separator of LG(𝒫), then LG(𝒫)−F has exactly two connected components.
Proof.
Assume that LG(𝒫)−F has more than two connected components. By Lemma 1, LG(𝒫)−F has at least two full components. Let H1 and H2 be two full components of LG(𝒫)−F. Let H3 be a connected component of LG(𝒫)−F different from H1 and H2. By assumption LG(𝒫) is connected. Thus, there exists an edge {e,e3} in LG(𝒫) where e∈F and e3∈H3. Since H1 and H2 are full components, there exist edges {e,e1} and {e,e2} in LG(𝒫) where e1∈V(H1) and e2∈V(H2).
Let e={u,v}, and assume without loss of generality that u∈e∩e3. Then, there is no vertex f∈V(H1) where u∈e∩f. Thus, v∈e∩e1. Similarly, there is no vertex f∈V(H2) such that u∈f∩e or v∈f∩e. But then H2 does not contain a vertex adjacent to e, so H2 is not a full component, a contradiction.
Corollary 1.
If F is a minimal separator of LG(𝒫), then LG(𝒫)−F′ is connected for any F′⊂F.
Proof of Lemma 2.
We prove that if F is a legal minimal separator of LG(𝒫) then F is a nice minimal cut of G(𝒫). The proof for the other direction is similar and is omitted.
First, we show that F is a cut of G(𝒫). Assume the contrary. Let {u,v} and {p,q} be vertices in different components of LG(𝒫)−F. Since G(𝒫)−F is connected, there is a path between vertices u and q in this graph. Also, {u,v}∉F and {p,q}∉F. Thus, by Observation 1 there is also a path between vertices {u,v} and {p,q} of LG(𝒫)−F. This implies that {u,v} and {p,q} are in the same connected component of LG(𝒫)−F, a contradiction. Thus F is a cut.
Next we show that F is a nice cut of G(𝒫). For every T∈𝒫, all the vertices of LG(T) in F form a clique in LG(T). Thus, all the edges of T in F are incident on a common vertex, so F is legal. To complete the proof, assume that G(𝒫)−F has a connected component with no edge and let u be the vertex in one such component. Then, Inc(u)⊆F. But F is a minimal separator of LG(𝒫), and by Lemma 3, Inc(u) ⊈ F, a contradiction. Thus, F is a nice cut.
Lastly, we show that F is a minimal cut of G(𝒫). Assume, on the contrary, that there exists F′⊂F where G(𝒫)−F′ is disconnected. Since F′⊂F and every connected component of G(𝒫)−F has at least one edge, every connected component of G(𝒫)−F′ also has at least one edge. Let {u,v} and {p,q} be edges in different components of G(𝒫)−F′. By Corollary 1, LG(𝒫)−F′ is connected and thus, there is a path between {u,v} and {p,q} in LG(𝒫)−F′. By Observation 1 there must also be a path between vertices u and p in G(𝒫)−F′. Hence, edges {u,v} and {p,q} are in the same connected component of G(𝒫)−F′, a contradiction. Thus, F is a minimal cut.
Lemma 5.
Two legal minimal separators F and F′ of LG(𝒫) are parallel if and only if the nice minimal cuts F and F′ are parallel in G(𝒫).
Proof.
Assume that separators F and F′ of LG(𝒫) are parallel, but cuts F and F′ of G(𝒫) are not. Then, there exists a set {{u,v},{p,q}}⊆F′ where {u,v} and {p,q} are in different components of G(𝒫)−F. Since F and F′ are parallel separators in LG(𝒫), and F does not contain {u,v} and {p,q}, there exists a path between vertices {u,v} and {p,q} in LG(𝒫)−F. Then, by Observation 1 there also exists a path between vertices u and q in G(𝒫)−F. Thus, {u,v} and {p,q} are in the same connected component of G(𝒫)−F, a contradiction.
The other direction can be proved similarly, using Observation 1.
The next lemma, from [9], follows from the definition of restricted triangulation.
Lemma 6.
Let H be a restricted triangulation of LG(𝒫) and let (T,B) be a clique tree of H. Let e={u,v} be any vertex in LG(𝒫). Then, there does not exist a node x∈V(T) where B(x) contains vertices from both Inc(u)∖{e} and Inc(v)∖{e}.
Lemma 7.
Let T be a tree in 𝒫 and suppose F is a minimal cut of G(𝒫) that contains precisely one edge e of T. Then, the edges of the two subtrees of T−e are in different connected components of G(𝒫)−F.
Proof.
Let e={u,v}. For each x∈e, let T_x denote the subtree containing vertex x in T−e. For each vertex x∈e, all the edges of T_x are in the same connected component of G(𝒫)−F as x, because e is the only edge of T in F. Since F is a minimal cut of G(𝒫), the endpoints of e are in different connected components of G(𝒫)−F. Hence, the edges of T_u and T_v are also in different connected components of G(𝒫)−F.
Characterizing compatibility via cuts
A set ℱ of cuts of G(𝒫) is complete if, for every input tree T∈𝒫 and every internal edge e of T, there is a cut F∈ℱ where e is the only edge of T in F.
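Completeness is a purely combinatorial condition on how cuts meet the edge sets of the input trees. The sketch below checks only that condition; it does not verify that the given edge sets are minimal cuts or that they are pairwise parallel. The representation and names are illustrative assumptions.

```python
def internal_edges(tree):
    """Edges of a tree (edge list) whose endpoints both have degree greater than one."""
    degree = {}
    for u, v in tree:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return [frozenset((u, v)) for u, v in tree if degree[u] > 1 and degree[v] > 1]

def is_complete(profile, cuts):
    """Every internal edge e of every tree T is the only edge of T in some cut."""
    for tree in profile:
        tree_edges = {frozenset(e) for e in tree}
        for e in internal_edges(tree):
            if not any(cut & tree_edges == {e} for cut in cuts):
                return False
    return True

# Hypothetical profile: two quartet trees sharing the labels c and d.
t1 = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
t2 = [("c", 3), ("d", 3), (3, 4), ("e", 4), ("f", 4)]
profile = [t1, t2]

f1 = {frozenset((1, 2))}
f2 = {frozenset((3, 4))}
f1_extra = {frozenset((1, 2)), frozenset(("a", 1))}
```

In this profile, {f1, f2} is complete, whereas {f1} alone misses the internal edge {3,4} and {f1_extra, f2} fails because f1_extra contains a second edge of the first tree.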
Lemma 8.
G(𝒫) has a complete set of pairwise parallel nice minimal cuts if and only if it has a complete set of pairwise parallel legal minimal cuts.
Proof.
The “only if” part follows from the definition of a nice cut. Let ℱ be a complete set of pairwise parallel legal minimal cuts. Consider any minimal subset ℱ′ of ℱ that is also complete. Let F be a cut in ℱ′. Since ℱ′ is minimal, there exists an internal edge e∈F of some input tree T such that e is the only edge of T in F. Since e is an internal edge, both subtrees of T−e have at least one edge each. Thus by Lemma 7, both connected components of G(𝒫)−F have at least one edge each. Hence, F is a nice minimal cut of G(𝒫). It follows that ℱ′ is a complete set of pairwise parallel nice minimal cuts of G(𝒫).
We now characterize the compatibility of a profile in terms of minimal cuts in the display graph of the profile.
Theorem 6.
A profile 𝒫 of unrooted phylogenetic trees is compatible if and only if there exists a complete set of pairwise parallel legal minimal cuts for G(𝒫).
Example 1.
For the display graph of Figure 1, let ℱ={F1,F2,F3,F4}, where F1={{1,2},{5,6}}, F2={{2,3},{6,7},{5,6}}, F3={{4,5},{1,2},{1,c}} and F4={{6,7},{2,f}}. Then, ℱ is a complete set of pairwise parallel nice minimal cuts.
Theorem 6 has an analog in terms of LG(𝒫). Let us say that a set ℱ of legal minimal separators of LG(𝒫) is complete if for every internal edge e of an input tree T, there exists a separator F∈ℱ where e is the only vertex of LG(T) in F.
Theorem 7.
A profile 𝒫 of unrooted phylogenetic trees is compatible if and only if there exists a complete set of pairwise parallel legal minimal separators for LG(𝒫).
This result is a direct consequence of Theorem 6 and Lemmas 2, 5, and 8, so we omit its proof. Instead, we focus on the proof of Theorem 6, for which we need the next fact.
Lemma 9.
The following two statements are equivalent.
(i) There exists a maximal set ℱ of pairwise parallel minimal separators of LG(𝒫) where every separator in ℱ is legal.
(ii) There exists a complete set of pairwise parallel nice minimal cuts for G(𝒫).
Proof.
(i) ⇒ (ii): We show that for every internal edge e={u,v} of an input tree T there exists a minimal separator in ℱ that contains only vertex e from LG(T). Then it follows from Lemmas 2 and 5 that ℱ is a complete set of pairwise parallel nice minimal cuts for G(𝒫).
As shown in the proof of Theorem 5, LG(𝒫)_ℱ is a restricted minimal triangulation of LG(𝒫). Let (S,B) be a clique tree of LG(𝒫)_ℱ. By definition, the vertices in each of the sets Inc(u) and Inc(v) form a clique in LG(𝒫). Consider any vertex p of S where Inc(u)⊆B(p) and any vertex q of S where Inc(v)⊆B(q). (Since (S,B) is a clique tree of LG(𝒫)_ℱ, such vertices p and q must exist.) Also, by Lemma 6, p≠q, B(p)∩(Inc(v)∖{e})=∅ and B(q)∩(Inc(u)∖{e})=∅.
Let π=p,x_1,x_2,…,x_m,q be the path from p to q in S where m≥0. Let x_0=p and x_{m+1}=q. Let x_i be the vertex nearest to p in path π where i∈[m+1] and B(x_i)∩(Inc(u)∖{e})=∅. Let F=B(x_{i−1})∩B(x_i). Then by Theorem 2, F∈ℱ. Since Inc(u)∩Inc(v)={e}, by the coherence property, e∈B(x_j) for every j∈[m]. Thus, e∈F. By Lemma 6, B(x_{i−1})∩(Inc(v)∖{e})=∅. Since B(x_i)∩(Inc(u)∖{e})=∅, F∩Inc(u)={e} and F∩Inc(v)={e}. Thus, for every vertex e′∈LG(T) where e≠e′ and e∩e′≠∅, e′∉F. Also, since every separator in ℱ is legal, we have f∉F for every vertex f∈LG(T) where f∩e=∅. Thus, e is the only vertex of LG(T) in F.
(i) ⇐ (ii): Consider any complete set ℱ of pairwise parallel nice minimal cuts of G(𝒫). By Lemmas 2 and 5, ℱ is a set of pairwise parallel legal minimal separators of LG(𝒫). There exists a maximal set ℱ′ of pairwise parallel minimal separators where ℱ⊆ℱ′.
Assume that ℱ′ contains a minimal separator F that is not legal. Then, there must exist a tree T∈𝒫 where at least two nonincident edges e1={x,y} and e2={x′,y′} of T are in F. Consider any internal edge e3 in T where e1 and e2 are in different components of T−e3. Such an edge exists because e1 and e2 are nonincident. Since ℱ is complete, there exists a cut F′∈ℱ where e3 is the only edge of T in F′. Since F and F′ are in ℱ′, they are parallel to each other and vertices e1 and e2 are in the same connected component of LG(𝒫)−F′. Thus, by Observation 1, there exists a path between vertices x and x′ in G(𝒫)−F′ and edges e1 and e2 are also in the same connected component of G(𝒫)−F′. But by Lemma 7 that is impossible.
Thus, every separator of ℱ′ is legal and ℱ′ is a maximal set of pairwise parallel minimal separators of LG(𝒫) where every separator in ℱ′ is legal.
Proof of Theorem 6.
By Theorem 5 and Lemma 9, profile 𝒫 is compatible if and only if there exists a complete set of pairwise parallel nice minimal cuts for G(𝒫). The rest follows from Lemma 8.
Splits and cuts
We first argue that for every nice minimal cut of G(𝒫) we can derive a split of L(𝒫). We use the following notation: if H is a subgraph of G(𝒫), then L(H) represents the set of all leaves of H.
Lemma 10.
Let F be a nice minimal cut of G(𝒫) and let G1 and G2 be the two connected components of G(𝒫)−F. Then, L(G_i)≠∅ for i∈{1,2}. In particular, L(G1)|L(G2) is a split of L(𝒫).
Proof.
Consider G_i for each i∈{1,2}. We show that L(G_i) is non-empty. Since F is nice, G_i contains at least one edge e of G(𝒫). If e is a non-internal edge, then L(G_i) is non-empty. Assume that e={u,v} is an internal edge of some input tree T. If F does not contain an edge of T, then L(T)⊆L(G_i) and thus L(G_i) is non-empty. Assume that F contains one or more edges of T. Let T_u, T_v be the two subtrees of T−e. Since F is a nice minimal cut, F contains edges from either T_u or T_v but not both. Without loss of generality assume that F does not contain edges from T_u. Then, every edge of T_u is in the same component as e. Since T_u contains at least one leaf, L(G_i) is non-empty. Thus, L(G1)|L(G2) is a split of L(𝒫).
Let σ(F) denote the split of L(𝒫) induced by a nice minimal cut F. If ℱ is a set of nice minimal cuts of G(𝒫), Σ(ℱ) denotes the set of all the non-trivial splits in {σ(F) : F∈ℱ}. The following result expresses the relationship between complete sets of nice minimal cuts and the compatibility of splits.
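Computing the split induced by a cut amounts to removing the cut edges from the display graph and reading off the labels in each of the two resulting components. The sketch below does this for a hypothetical two-quartet profile; the function name, the representation, and the error raised for inputs that do not leave exactly two components are assumptions of this example.

```python
def induced_split(profile, labels, cut):
    """Split of `labels` given by the two components of the display graph minus `cut`."""
    labels = set(labels)
    adj = {}
    for tree in profile:
        for u, v in tree:
            adj.setdefault(u, set())
            adj.setdefault(v, set())
            if frozenset((u, v)) not in cut:
                adj[u].add(v)
                adj[v].add(u)

    sides, seen = [], set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    stack.append(y)
        sides.append(frozenset(comp & labels))

    if len(sides) != 2:
        raise ValueError("a minimal cut must leave exactly two components")
    return frozenset(sides)

# Hypothetical profile: two quartet trees sharing the labels c and d.
t1 = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
t2 = [("c", 3), ("d", 3), (3, 4), ("e", 4), ("f", 4)]
profile = [t1, t2]

s1 = induced_split(profile, "abcdef", {frozenset((1, 2))})   # ab|cdef
s2 = induced_split(profile, "abcdef", {frozenset((3, 4))})   # abcd|ef
```

The two splits obtained here, ab|cdef and abcd|ef, are compatible because ab and ef are disjoint, which is what Lemma 11 predicts for parallel cuts.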
Theorem 8.
If G(𝒫) has a complete set of pairwise parallel nice minimal cuts ℱ, then Σ(ℱ) is compatible and any compatible tree for Σ(ℱ) is also a compatible tree for 𝒫.
Example 2.
For the complete set of pairwise parallel nice minimal cuts for the display graph of Example 1, we have σ(F1)=abc|defg, σ(F2)=abcfg|de, σ(F3)=ab|cdefg, and σ(F4)=abcde|fg. Note that these splits are pairwise compatible.
The proof of Theorem 8 uses the following lemma.
Lemma 11.
Let F1 and F2 be two parallel nice minimal cuts of G(𝒫). Then, σ(F1) and σ(F2) are compatible.
Proof.
Let σ(F1)=U1|U2 and σ(F2)=V1|V2. Assume that σ(F1) and σ(F2) are incompatible. Thus, U_i∩V_j≠∅ for every i,j∈{1,2}. Let a∈U1∩V1, b∈U1∩V2, c∈U2∩V1 and d∈U2∩V2. Since {a,b}⊆U1, there exists a path π1 between leaves a and b in G(𝒫)−F1. But a and b are in different components of G(𝒫)−F2. Thus, an edge e1 of path π1 is in the cut F2. Similarly, {c,d}⊆U2 and there exists a path π2 between labels c and d in G(𝒫)−F1. Since c and d are in different components of G(𝒫)−F2, cut F2 contains an edge e2 of path π2. But π1 and π2 are in different components of G(𝒫)−F1, so edges e1 and e2 are in different components of G(𝒫)−F1. Since {e1,e2}⊆F2, the cuts F1 and F2 are not parallel, a contradiction.
Proof of Theorem 8.
The compatibility of Σ(ℱ) follows from Lemma 11 and Theorem 1. Let S be a compatible tree for Σ(ℱ), let T be an input tree of 𝒫, let S′=S|L(T), and let e be any internal edge of T. We show that S′ displays σ_e(T).
Let σ_e(T)=A|B. There exists a cut F∈ℱ where e is the only edge of T in F. By Lemma 7, since F is minimal, the leaves of sets A and B are in different components of G(𝒫)−F. Thus, if σ(F)=A′|B′ then, up to renaming of sets, we have A⊆A′ and B⊆B′. Because S displays σ(F), S′ also displays σ_e(T). Since S′ displays all the splits of T, T can be obtained from S′ by contracting zero or more edges [10]. Thus, S displays T. Since S displays every tree in 𝒫, S is a compatible tree for 𝒫.
Characterizing agreement via cuts
The following characterization of agreement is similar to the one for tree compatibility given by Theorem 6, except for an additional restriction on the minimal cuts.
Theorem 9.
A profile 𝒫 has an agreement supertree if and only if G(𝒫) has a complete set ℱ of pairwise parallel legal minimal cuts where, for every cut F∈ℱ and for every T∈𝒫, there is at most one edge of T in F.
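The extra restriction that separates agreement from compatibility is that no cut may contain two edges of the same input tree. A minimal sketch of that single check follows; it does not test completeness, minimality, or parallelism, and the names and the profile are hypothetical.

```python
def at_most_one_edge_per_tree(profile, cuts):
    """True iff every cut contains at most one edge of every input tree."""
    for tree in profile:
        tree_edges = {frozenset(e) for e in tree}
        if any(len(cut & tree_edges) > 1 for cut in cuts):
            return False
    return True

# Hypothetical profile: two quartet trees sharing the labels c and d.
t1 = [("a", 1), ("b", 1), (1, 2), ("c", 2), ("d", 2)]
t2 = [("c", 3), ("d", 3), (3, 4), ("e", 4), ("f", 4)]
profile = [t1, t2]

single_edges = [{frozenset((1, 2))}, {frozenset((3, 4))}]
two_of_t1 = [{frozenset((1, 2)), frozenset(("a", 1))}]

agreement_ok = at_most_one_edge_per_tree(profile, single_edges)
agreement_bad = at_most_one_edge_per_tree(profile, two_of_t1)
```

A cut with two incident edges of one tree can still be legal, so it may witness compatibility, but it cannot be part of a family witnessing agreement.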
Example 3.
One can verify that the display graph of Figure 1 does not meet the conditions of Theorem 9 and, thus, the associated profile does not have an AST. On the other hand, for the display graph of Figure 2, let ℱ={F1,F2,F3}, where F1={{1,2},{4,5}}, F2={{1,2},{5,6}} and F3={{2,3},{6,d}}. For any given input tree T, every cut in ℱ has at most one edge of T. Also, ℱ is a complete set of pairwise parallel legal minimal cuts. Thus, by Theorem 9, the input trees of Figure 2 have an AST.
The analogue of Theorem 9 for LG(𝒫), stated next, follows from Theorem 9 and Lemmas 2, 5, and 8.
Theorem 10.
A profile 𝒫 has an agreement supertree if and only if LG(𝒫) has a complete set ℱ of pairwise parallel legal minimal separators where, for every F∈ℱ and every T∈𝒫, there is at most one vertex of LG(T) in F.
Theorem 9 follows from Lemma 8 and the next result.
Lemma 12.
A profile 𝒫 has an agreement supertree if and only if G(𝒫) has a complete set ℱ of pairwise parallel nice minimal cuts where, for every cut F∈ℱ and every T∈𝒫, there is at most one edge of T in F.
The rest of the section is devoted to the proof of Lemma 12.
Let S be an AST of 𝒫 and let e={u,v} be an edge of S. Let S_u and S_v be the subtrees of S−e containing u and v, respectively. Let L_u=L(S_u) and L_v=L(S_v). Thus, σ_e(S)=L_u|L_v. Assume that there exists an input tree T where L(T)∩L_x≠∅ for each x∈{u,v}. Then there exists an edge f∈E(T) where, if σ_f(T)=A1|A2, then A1⊆L_u and A2⊆L_v. (If there were no such edge, S|L(T) would contain a split that is not in T and would thus not be isomorphic to T.) We call e an agreement edge of S corresponding to edge f of T. Note that there does not exist any other edge f′ of T where e is also an agreement edge of S with respect to edge f′ of T.
The cut function of an AST S of 𝒫 is the mapping Ψ from E(S) to subsets of edges of G(𝒫) defined as follows. For every e∈E(S), an edge f of an input tree T is in Ψ(e) if and only if e is an agreement edge of S corresponding to edge f of T. Observe that Ψ is uniquely defined. Given an edge e∈E(S), we define a set V_x for each x∈e as follows. For every T∈𝒫, let V_{x,T} consist of all the vertices of the minimal subtree of T connecting the labels in L(T)∩L_x. Then, V_x=⋃_{T∈𝒫} V_{x,T}. Note that if e={u,v} then {V_u,V_v} is a partition of V(G(𝒫)).
Lemma 13.
Let S be an AST of 𝒫 and let Ψ be the cut function of S. Then, for every edge e∈E(S),
(i) Ψ(e) is a cut of G(𝒫) and
(ii) Ψ(e) is a minimal cut of G(𝒫) if and only if G(𝒫)−Ψ(e) has exactly two connected components.
Proof.
(i) Let e={u,v}. We show that does not contain an edge whose endpoints are in distinct sets of {V u ,V v }. Assume the contrary. Let f={x,y} be an edge of where x∈V u and y∈V v .
Since , f∉Ψ(e). Suppose f is an edge of input tree T. There are two cases.
-
1.
Ψ(e ) does not contain an edge of T. Then, there exists an endpoint p of e where . Without loss of generality, let u=p. Then, V(T)⊆V u and thus y∈V u , a contradiction.
-
2.
Ψ(e ) contains an edge f ′≠f of T. Let f ′={r,s} and let L r ⊆L u and L s ⊆L v . Let x,r be the vertices of f and f ′ where L x ⊂L r . Since T is a phylogenetic tree, such vertices x and r exist. Since L r ⊆L u , both the endpoints of f are in V u , a contradiction.
Thus, does not contain an edge whose endpoints are in different sets of {V u ,V v }. Since V u and V v are non-empty, Ψ(e) is a cut of .
(ii) The “only if” part follows from the definition of a minimal cut. We now prove the “if” part. Let e={u,v}. Assume that has exactly two connected components. From the proof of (i), V u and V v are the vertex sets of those two connected components. Consider any edge f∈Ψ(e). The endpoints of f are in different sets of {V u ,V v } and thus are in different connected components of . Hence, is connected. Thus, if has exactly two connected components, Ψ(e) is a minimal cut of .
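The minimality criterion in part (ii) is easy to test mechanically. The sketch below assumes a connected input graph and uses the standard fact that, in a connected graph, an edge set is a minimal cut exactly when its removal leaves two connected components and every removed edge joins those two components. The function names and the 4-cycle example are illustrative assumptions.

```python
from collections import deque

def components(vertices, edges):
    """Connected components of the graph (vertices, edges); edges are 2-element frozensets."""
    adj = {v: set() for v in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    queue.append(y)
        comps.append(comp)
    return comps

def is_minimal_cut(vertices, edges, cut):
    """True if removing `cut` from a connected graph leaves exactly two components
    and every edge of `cut` has one endpoint in each of them."""
    comps = components(vertices, [e for e in edges if e not in cut])
    if len(comps) != 2:
        return False
    return all(len(e & comps[0]) == 1 for e in cut)

def E(a, b):
    return frozenset((a, b))

# Hypothetical example: the 4-cycle 1-2-3-4-1.
V = [1, 2, 3, 4]
cycle = [E(1, 2), E(2, 3), E(3, 4), E(4, 1)]
```

On the 4-cycle, {{1,2},{3,4}} is a minimal cut, {{1,2}} is not a cut at all, and {{1,2},{2,3},{3,4}} is a cut that is not minimal because its removal leaves three components.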
The next observation summarizes two basic facts about cut functions.
Observation 2.
Let S be an AST of . Then, the cut function Ψ of S has the following properties.
-
1.
For any two distinct edges e 1 and e 2 in E(S), Ψ(e 1)≠Ψ(e 2).
-
2.
Let e={u,v} be an edge of S. For any input tree T where , all the labels of are in the same connected component of .
Let S be an AST of and let e be an edge of S. Although Lemma 13 shows that Ψ(e) is a cut of , Ψ(e) may not be minimal. We now argue that we can always construct an agreement supertree whose cut function gives minimal cuts.
Lemma 14.
If has an AST, then it has an AST S of whose cut function Ψ satisfies the following: For every edge e∈S, Ψ(e) is a minimal cut of .
We prove Lemma 14 by arguing that any AST that fails to satisfy the required cut minimality property can be transformed into one that does, through repeated application of the “splitting” operation, defined next.
Suppose e={u,v} is an edge of S where Ψ(e) is not minimal. Let {L1,…,L m } be the partition of L v where for every i∈[m], for some connected component C in . We assume without loss of generality that m>1 (if not, we can just exchange the roles of u and v). Let R v be the rooted tree derived from S v by distinguishing vertex v as the root. Let Rv,i be the (rooted) tree obtained from the minimal subtree of R v connecting the labels in L i by distinguishing the vertex closest to v as the root and suppressing every other vertex that has degree two. To split edge e at u is to construct a new tree S′ from S in two steps: (i) delete the vertices of R v from S and (ii) for every i∈[m], add an edge from u to the root of Rv,i.
Observation 3.
Let S be an AST of and let Ψ be the cut function of S. Let S′ be the tree derived by splitting edge e={u,v} at u. Consider any connected component C of where . Then, for every , S|X and are isomorphic.
The next observation follows from the definition of AST.
Observation 4.
Let S and T be two phylogenetic trees where and T agrees with S. Then, T and S|U agree for every U such that .
Lemma 15.
Let S be an AST of and let e={u,v} be an edge of S. Let S′ be the tree derived by splitting edge e at u. Then, S′ is an AST of .
Proof.
By construction, S′ is a phylogenetic tree over . As before, let {L1,…,L m } be the partition of L v where for every i∈[m], for some connected component C in . Consider any input tree T of profile . We prove that T and S′ agree. There are three cases.
Case 1:. Since , by Observation 4, T and agree. By the definition of the split operation, trees and are isomorphic. Thus, T and S′ agree.
Case 2:. By Observation 2(ii), for some i∈[m]. Since T and S agree and , by Observation 4, T and agree. By construction, trees and are isomorphic. Thus, T and S′ agree.
Case 3: and . By Observation 2(ii), for some i∈[m]. Since T and S agree and , by Observation 4, T also agrees with . By construction, trees and are isomorphic. Thus, T and agree. It follows that T and S′ agree.
Thus, S′ is an AST of .
Observe that if S′ is the tree obtained by splitting edge e={u,v} of S at u, then the edges of E(S u ) are in both S and S′.
Therefore, E(S)∖E(S u )=E(S)∖E(S′) and E(S′)∖E(S u )=E(S′) ∖ E(S).
Lemma 16.
Let S be an AST of and let e={u,v} be an edge of S. Let S′ be the tree obtained by splitting e at u. Let Ψ, Ψ′ be the cut functions of S and S′ respectively. Consider any edge f∈E(S′)∖E(S). There exists an edge e′∈E(S)∖E(S′) where Ψ′(f)⊆Ψ(e′). Furthermore, if Ψ(e′) is a minimal cut of then Ψ′(f)=Ψ(e′) and Ψ′(f) is a minimal cut of .
Proof.
Let f={x,y} and let x be the vertex of f where L x ⊆L v . Let S p be the minimal subtree of S connecting the labels in L x . Let p be the vertex of S p closest to u in S. Let q be the vertex adjacent to p in the path from p to u. Let e′={p,q}. Note that, L x ⊆L p . Since L x ⊆L v , e′ is an edge of E(S)∖E(S′). Consider any tree T that has an edge f1 in Ψ′(f). We show that . It then follows that f1∈Ψ(e′) and thus, Ψ′(f)⊆Ψ(e′).
Since L x ⊆L p , . By Observation 2(ii), all the labels in are in the same connected component of . Thus, all the labels in are in the same connected component of . If , then and are not isomorphic, contradicting Observation 3. Thus,
Assume that Ψ(e′) is a minimal cut of . Then, all the labels in L p are in the same connected component of . By Observation 3, L p =L x . Thus, Ψ′(f) is also a minimal cut of .
Lemma 17.
Let S be an AST of and Ψ be the cut function of S. Let E0 be the set of all edges e of S such that Ψ(e) is not a minimal cut of . Choose any edge e∗={u,v}∈E0 such that |Ψ(e∗)|=max{|Ψ(e)|:e∈E0}. Let S′ be the tree obtained from S by splitting e∗ at u and let Ψ′ be the cut function of S′. We have the following.
-
1.
For any edge f∈E(S ′), if |Ψ ′(f)|>|Ψ(e ∗)| then Ψ ′(f) is a minimal cut of .
-
2.
Let P be the set of all edges x in S such that |Ψ(e∗)|=|Ψ(x)| and Ψ(x) is not a minimal cut. Let P′ be the set of all edges x in S′ such that |Ψ(e∗)|=|Ψ′(x)| and Ψ′(x) is not a minimal cut. Then, |P′|<|P|.
Proof.
(i) Consider any edge f∈E(S′) where |Ψ′(f)|>|Ψ(e∗)|. If f∈E(S)∩E(S′), then Ψ(f)=Ψ′(f). Since |Ψ(f)|>|Ψ(e∗)|, by assumption Ψ(f) is a minimal cut of . Thus, Ψ′(f) is also a minimal cut of . Assume that f∈E(S′)∖E(S). By Lemma 16, there exists an edge e′∈E(S) where Ψ′(f)⊆Ψ(e′). Since |Ψ′(f)|>|Ψ(e∗)|, |Ψ(e′)|>|Ψ(e∗)|. Thus, by assumption Ψ(e′) is a minimal cut of . From Lemma 16, it follows that Ψ(e′)=Ψ′(f) and Ψ′(f) is a minimal cut of .
(ii) Let Q=P∩(E(S)∖E(S′)) and Q′=P′∩(E(S′)∖E(S)). It suffices to show that |Q′|<|Q|. Consider any edge f∈Q′. By Lemma 16, there exists an edge e′∈E(S)∖E(S′) where Ψ′(f)⊆Ψ(e′). Thus, |Ψ(e′)|≥|Ψ′(f)|. If |Ψ(e′)|>|Ψ′(f)|, then by assumption Ψ(e′) is a minimal cut and thus by Lemma 16 |Ψ(e′)|=|Ψ′(f)|, a contradiction.
Thus, Ψ(e′)=Ψ′(f). Also, since Ψ′(f) is not a minimal cut, by Lemma 16, neither is Ψ(e′). If e′=e∗, then all vertices of V v are in the same connected component of , contradicting the assumption that it is possible to split e∗ at u. Thus, e′≠e∗. Hence, we can conclude that for every edge f∈Q′, there exists an edge e′∈(Q∖{e∗}), where Ψ′(f)=Ψ(e′).
Let f1 and f2 be any two distinct edges in Q′. Let e1 and e2 be the edges of Q∖{e∗} where Ψ′(f1)=Ψ(e1) and Ψ′(f2)=Ψ(e2). If e1=e2, then Ψ′(f1)=Ψ′(f2), contradicting Observation 2(i). Thus, e1≠e2. Since e∗∈Q and e∗∉Q′, it follows that |Q′|≤|Q|−1, and thus |Q′|<|Q|.
Proof of Lemma 14.
Let S be an AST of and Ψ be the cut function of S. Do the following while S contains an edge e such that Ψ(e) is not a minimal cut of : Pick an edge e∗ satisfying the conditions of Lemma 17, and apply a split operation at e∗; let S′ be the resulting tree. By Lemma 15, S′ is also an AST of . Let Ψ′ be the cut function of S′. Set S to S′ and Ψ to Ψ′.
We only need to prove that the total number of iterations, s, is finite. An AST of has at most vertices. Also, |Ψ(e)|≥1 for any edge e of S. It thus follows from Lemma 17 that s is finite.
Proof of Lemma 12
(⇐) Assume that has an AST. Then, by Lemma 14, has an AST S whose cut function Ψ has the property that, for every edge e∈E(S), Ψ(e) is a minimal cut of . Let be the set of all Ψ(e) such that e is an internal edge of S. Then, is a set of minimal cuts of . Further, by definition of Ψ, for every and for every , F contains at most one edge of T. Thus every cut in is legal. We now prove that is a complete set of pairwise parallel nice minimal cuts of .
We first argue that every cut in is nice. Consider any . Let e={u,v} be the internal edge of S where Ψ(e)=F. Let T be an input tree that has an internal edge f in Ψ(e). Since e is an internal edge at least one such input tree exists; otherwise Ψ(e) is not a minimal cut. Now, by definition, f is the only edge of T in Ψ(e), so, by Lemma 7, each of the two connected components of has at least one non-internal edge of T. Hence, F is a nice minimal cut of .
To prove that the cuts in are pairwise parallel, we argue that for any two distinct internal edges e1 and e2 of S, Ψ(e1) and Ψ(e2) are parallel. There exist vertices x∈e1 and y∈e2 where L x ⊆L y . For every edge f∈Ψ(e1), we show that f∈Ψ(e2) or f⊆V y . It then follows that Ψ(e1) and Ψ(e2) are parallel. Let f be an edge of input tree T. Then there exists z∈f where L z ⊆L x . Thus, L z ⊆L y and z∈V y . By Lemma 13, all the vertices of V y are in the same connected component of . Thus, f∈Ψ(e2) or f⊆V y .
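The parallelism test can also be phrased operationally. The sketch below assumes the reading that two cuts F1 and F2 are parallel when at most one connected component of the graph minus F1 contains edges of F2; this reading is suggested by the uniqueness remark in the construction of Section ‘Relationship to legal triangulations’, and it is an assumption here rather than a quotation of the paper's definition. All names and the 4-cycle example are illustrative.

```python
from collections import deque

def components(vertices, edges):
    """Connected components of the graph (vertices, edges); edges are 2-element frozensets."""
    adj = {v: set() for v in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.add(y)
                    queue.append(y)
        comps.append(comp)
    return comps

def are_parallel(vertices, edges, cut1, cut2):
    """True if at most one component of G - cut1 contains an edge of cut2.
    Edges shared by both cuts lie in no component and are ignored."""
    comps = components(vertices, [e for e in edges if e not in cut1])
    hit = [c for c in comps if any(e <= c for e in cut2 if e not in cut1)]
    return len(hit) <= 1

def E(a, b):
    return frozenset((a, b))

# Hypothetical example: the 4-cycle 1-2-3-4-1.
V = [1, 2, 3, 4]
cycle = [E(1, 2), E(2, 3), E(3, 4), E(4, 1)]
F1 = {E(1, 2), E(3, 4)}
```

With F1={{1,2},{3,4}}, the cut {{2,3},{4,1}} has one edge in each component of the cycle minus F1, so the two cuts cross; the cut {{1,2},{2,3}} touches only one component and is parallel to F1 under this reading.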
Lastly, we show that ℱ is complete. Consider any internal edge f={p,q} of some input tree T. Since S is an AST of , there exists an edge e={u,v} where, up to relabeling of sets, L p ⊆L u and L q ⊆L v . Thus, e is an agreement edge of S corresponding to f, so f∈Ψ(e). Since f is an internal edge, e is also an internal edge of S and thus Ψ(e)∈ℱ. Hence, for every internal edge f of an input tree there is a cut F∈ℱ where f∈F. Thus, ℱ is complete.
(⇒) Assume that there exists a complete set of pairwise parallel nice minimal cuts of where, for every and every , F contains at most one edge of T. By Theorem 18, is compatible and, by Theorem 1, there exists an unrooted tree S where . We prove that S is an AST of by showing that for every input tree .
Consider an input tree T of . Let X1|X2 be the non-trivial split of T corresponding to edge f∈E(T). Since is complete, there exists a cut where . If σ(F)=Y1|Y2, by Lemma 7, up to relabeling of sets, X i ⊆Y i for every i∈{1,2}. Since σ(F) is a split of S, this implies that .
Consider any non-trivial split P1|P2 of Σ(S) where for each i∈{1,2}. Let for each i∈{1,2}. Since , there exists a cut where σ(F)=P1|P2. Since P1 and P2 are in different connected components of , Q1 and Q2 are also in different connected components of . Thus, F contains an edge f′ of T. Since F does not contain any other edge of T, σ(f′)=Q1|Q2. Thus, .
Relationship to legal triangulations
Taken together, Theorems 3 and 6 say that has a complete set of pairwise parallel legal minimal cuts if and only if it has a legal triangulation. The connection between legal triangulations and complete sets of pairwise parallel legal minimal cuts is through the existence (or nonexistence) of a compatible tree. Here we make the connection explicit, showing how, from a set of pairwise parallel legal minimal cuts, one can construct a legal triangulation of without going through a compatible tree. We leave the other direction —going from a triangulation to a set of cuts— to the reader.
Let be a complete set of pairwise parallel legal minimal cuts of . We assume that the elements of are ordered in some arbitrary, but fixed, manner, and that no proper subset of is also complete. For each , we build a pair (X F ,Y F ) where X F and Y F are vertex separators of , and X F ,Y F ⊆{u:u is the endpoint of some edge in F}. The collection of pairs is not unique, as it depends on the order in which is arranged. We say that a cut differentiates an internal edge e={x,y} if x∈X F and y∈Y F .
For each , let F i =E(T i )∩F for each i∈[k], and let denote the set of all edges e such that e∈F i for some i∈[k] with |F i |=1. Note that if |F i |>1, all edges in F i must share a common endpoint. Let A F and B F denote the two connected components of .
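The grouping F i =E(T i )∩F and the common-endpoint condition noted above can be checked as follows. This is a minimal sketch that treats the common-endpoint condition as the legality test; that is an assumption based on the note in this paragraph. The names `group_by_tree` and `is_legal`, and the five-edge tree, are hypothetical.

```python
def group_by_tree(cut, tree_edges):
    """Return the list [F_1, ..., F_k], where F_i is the set of edges of tree i that lie in `cut`."""
    return [cut & edges_of_tree for edges_of_tree in tree_edges]

def is_legal(cut, tree_edges):
    """True if, for every tree contributing more than one edge to `cut`,
    those edges share a common endpoint."""
    for F_i in group_by_tree(cut, tree_edges):
        if len(F_i) > 1 and not frozenset.intersection(*F_i):
            return False
    return True

def E(a, b):
    return frozenset((a, b))

# Hypothetical input tree with internal vertices x, y and leaves a, b, c, d.
T1 = {E("x", "a"), E("x", "b"), E("x", "y"), E("y", "c"), E("y", "d")}
```

For this tree, an edge set containing {x,a} and {x,b} passes the check because both edges meet at x, while one containing {x,a} and {y,c} fails it.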
For each cut F in , we build (X F ,Y F ) as follows.
-
1.
For each internal edge :
-
(a)
If no cut preceding F differentiates e, add e∩V(A F ) to X F and e∩V(B F ) to Y F .
-
(b)
Otherwise, suppose cut , which precedes F, differentiates e. Let Q be the connected component of where E(Q)∩F≠∅. (Note that Q is unique, since I and F are parallel.) Let v be the unique endpoint of e in Q. Add v to X F and Y F .
-
2.
For each non-internal edge , add the non-leaf endpoint of e to both X F and Y F .
-
3.
For each i∈[k] such that |F i |>1, add the common endpoint of the edges of F i to both X F and Y F .
By construction and the properties of , every internal edge of is differentiated by some cut . Further, the sets X F and Y F have the form X F ={x1,…,x m ,z1,…,z p } and Y F ={y1,…,y m ,z1,…,z p }, where m>0, p≥0, and for every i∈[m], {x i ,y i } is an internal edge of that is differentiated by F. Let O F ={{x1,…,x i ,y i ,…,y m ,z1,…,z p }:i∈[m]}.
We now state how to go from a complete set of pairwise parallel legal cuts to a legal triangulation. As in Section ‘Preliminaries’, given a graph G and a collection Δ of subsets of V(G), GΔ denotes the graph derived from G by making the set of vertices of X a clique for every X∈Δ.
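The operation that produces GΔ from G can be written in a few lines. The sketch below assumes edges are stored as 2-element frozensets; the name `fill_in` and the 4-cycle example are illustrative.

```python
from itertools import combinations

def fill_in(edges, delta):
    """Return the edge set of G_Delta: the edges of G plus, for every X in delta,
    all pairs of vertices of X (so that X becomes a clique)."""
    result = set(edges)
    for X in delta:
        for a, b in combinations(sorted(X), 2):
            result.add(frozenset((a, b)))
    return result

def E(a, b):
    return frozenset((a, b))

# Hypothetical example: making {1, 2, 3} a clique in the 4-cycle 1-2-3-4-1.
cycle = {E(1, 2), E(2, 3), E(3, 4), E(4, 1)}
filled = fill_in(cycle, [{1, 2, 3}])
```

Here the only fill-in edge is {1,3}; the original edges are never removed, so G is always a subgraph of GΔ.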
Theorem 11.
Let Δ be the collection of subsets of given by
Then, is a legal triangulation of .
The proof of Theorem 11 relies on a series of auxiliary lemmas, for which we introduce some new notation. For each , F∪ denotes X F ∪Y F and F∩ denotes X F ∩Y F . Also, we abbreviate to GΔ, where Δ is the set defined in Equation (1).
Lemma 18.
Let F and I be two distinct cuts of , and let x be a vertex of F∪. Suppose x lies in the connected component of that does not contain edges of F. Then, x∈I∩.
Proof.
Let EF,x be the set of all edges of F that contain x and let EI,x be the set of all edges of I that contain x. We must have EF,x⊆EI,x⊆I. If |EI,x|>1, then x∈I∩. Thus, assume that |EI,x|=1. Let EI,x={e}, where e={x,y}. Since EF,x⊆EI,x and |EF,x|≥1, EF,x={e}. We can assume that y is not a leaf (since, otherwise, x∈I∩). Let EI,y be the set of edges of I with y as an endpoint. Vertex y lies in the component of G−F that does not contain I. Thus, every edge in EI,y is also present in F. If |EI,y|>1, then there is more than one edge in F with y as an endpoint and by construction, x∉F∪. Hence, |EI,y|=1, and so EI,y={e}.
Let J be the cut that differentiates e. If F=J then by construction, x∈I∩. Thus, assume that F≠J. If J is in the same connected component of as I, then, by construction x∉F∪, which is a contradiction. Thus, J is in the connected component of that does not contain I and, by construction, x∈I∩.
Lemma 19.
Let . For every edge {u,v} in GΔ, (i) if u∈V(A F )∖F∩, then v∉V(B F )∖Y F , and (ii) if u∈V(B F )∖F∩, then v∉V(A F )∖X F .
Proof.
Without loss of generality, we consider only the case where u∈V(A F )∖F∩. Suppose that v∈V(B F )∖Y F . If , then e∈F and hence, by construction, at least one of u and v is in F∪. But v∉Y F , so u∈F∩, a contradiction.
Thus, e must be a fill-in edge. Since e ⊈ F∪, there must be a cut , I≠F, such that e⊆I∪. If E(A F )∩I≠∅, then by Lemma 18, v∈F∩, a contradiction. Thus, assume that E(B F )∩I≠∅. Then, by Lemma 18, u∈F∩, another contradiction.
A clique of GΔ is illegal if it contains a fill-in edge with a leaf as an endpoint or it contains an internal edge along with any other edge of . An illegal clique violates one of the legal triangulation conditions (LT1) or (LT2) stated in Section ‘Display graphs and edge label intersection graphs’.
Lemma 20.
Let F be a cut of and let H be the subgraph of GΔ induced by vertices of F∪. Then, H is triangulated and contains no illegal clique.
Proof.
Let X F ={x1,…,x m ,z1,…,z p } and Y F ={y1,…,y m ,z1,…,z p }, where for every i∈[m], x i ∈V(A F ), y i ∈V(B F ) and {x i ,y i } is an internal edge of . Note that F∩={z1,…,z p }.
Claim. For every i,j∈[m] where i>j, e={x i ,y j }∉E(H ).
Proof.
Assume that e∈E(H). By construction of (X F ,Y F ), e is a fill-in edge. Since no set in O F contains both x i and y j , there is a cut where e⊆I∪. Since F and I are parallel, only one of the two sets I∩E(A F ) or I∩E(B F ) is non-empty. Assume that I∩E(A F )≠∅. Then by Lemma 18, y j ∈F∩, a contradiction. Similarly, if I∩E(B F )≠∅, then by Lemma 18, x i ∈F∩, a contradiction.
Let C be a chordless cycle of length at least four in H. Since X F and Y F are cliques in GΔ, if C contains more than two vertices from one of X F or Y F , then C must contain a chord. Hence, C has exactly four vertices, with exactly two vertices each from X F and Y F . We will first show that z i ∉C for any i∈[p]. Assume that z i ∈C for some i∈[p]. Then, of the remaining three vertices of C, at least two of them belong to one of X F and Y F . Let a, b be those two vertices. Without loss of generality assume that {a,b}⊆X F . Since F∩⊆X F , vertices z i , a, b form a clique in H. Thus, C is not chordless, a contradiction.
Let x i ,x j be the vertices of X F in C where 1≤i<j≤m. Similarly, let be the vertices of Y F in C where 1≤i′<j′≤m. Now, either i≤i′ or i>i′. If i≤i′, then {x1,…,x i ,y i ,…,y m ,z1,…,z p }∈O F and thus vertices form a clique. Hence, C is not chordless, a contradiction. If i>i′, then from the above claim neither of the edges and can exist. Thus, vertex cannot be in C, a contradiction. Hence, H does not contain a chordless cycle and is triangulated.
Assume that H contains an illegal clique H′; that is, H′ contains two internal edges e and e′. By construction, F∪ cannot contain a leaf. By the legality of F and the construction of F∪, edges e and e′ are from different input trees and both are differentiated by F. Let e={x i ,y i } for some i∈[m] and let e′={x j ,y j } for some j∈[m]. Without loss of generality, assume that i<j. By the above claim, there is no edge between x j and y i in H; thus, H′ is not a clique, a contradiction. □
Lemma 21.
GΔ is chordal.
Proof.
Assume the contrary. Let C be a chordless cycle of length at least four in GΔ. By construction, C cannot contain a leaf. There are two cases.
Case 1: There are vertices u,v∈V(C) and a cut where u∈X F ∖F∩ and v∈Y F ∖F∩. We have two subcases.
-
(a)
Suppose C contains a vertex x∈F∩. Then, there exists a path u,x,v in C. Because C is a cycle, there must exist an edge between a vertex u′∈V(A F )∖x and v′∈V(B F )∖x. Since C is chordless, u′∉F∩ and v′∉F∩. Thus, u′∈V(A F )∖F∩ and v′∈V(B F )∖F∩. By Lemma 19, if u′∈V(A F )∖X F then there is no edge between u′ and v′. Thus, u′∈X F ∖F∩. Similarly, v′∈Y F ∖F∩. If u≠u′ or v≠v′, C cannot be chordless. Thus, u=u′ and v=v′ and C has length three, a contradiction.
-
(b)
Suppose C does not contain a vertex of F∩. Since u∈V(A F )∖F∩, v∈V(B F )∖F∩ and F is a cut, there must exist two edges e1={x1,y1} and e2={x2,y2} in C where {x1,x2}⊆V(A F )∖F∩ and {y1,y2}⊆V(B F )∖F∩. If x1∈V(A F )∖X F , then by Lemma 19 there cannot exist an edge between x1 and y1. Thus, x1∈X F ∖F∩. Similarly, x2∈X F ∖F∩ and {y1,y2}⊆Y F ∖F∩. Since X F and Y F are cliques in GΔ, there exist edges {x1,x2} and {y1,y2}. Thus, there cannot exist any other vertex in C and hence V(C)⊆F∪. But, by Lemma 20, the subgraph of GΔ induced by the vertices of F∪ is triangulated. Thus, C is not chordless, a contradiction.
Case 2: There is no cut with vertices u∈X F ∖F∩ and v∈Y F ∖F∩ such that u,v∈V(C). Thus, for every cut at most two vertices of V(C) are in F∪. Let x1,x2,x3,x4 be a path of length four in C. For every i∈{1,2,3}, let be the cut where . We will first show that such cuts exist and are distinct.
Recall that every vertex in C is internal. Also, C does not contain any edge e={x,y} from G; otherwise, there would be a cut F′ that differentiates e, contradicting the assumption for case 2. Since every edge in C is in GΔ, it must be the case that for every edge e in C there exists a cut F where e⊆F∪. Also, at most two vertices of C are in F∪. Thus the cuts F(1), F(2) and F(3) are distinct.
To simplify notation, for each i∈{1,2,3} let and . Without loss of generality, assume that E(A1)∩F(2)≠∅ and E(B2)∩F(1)≠∅. There are three possibilities.
-
(a)
Suppose F (3)∩E(A 2)≠∅. If x 1∈A 2, then by Lemma 18, and C is not chordless, a contradiction. Thus, x 1∈B 2. Similarly, if x 4∈B 2, by Lemma 18, and C is not chordless, a contradiction. Thus, x 4∈A 2. Since C is a cycle, F (2) is a minimal cut and , there exists an edge {v 1,v 2} in C where and . But, by Lemma 19, such an edge cannot exist.
-
(b)
Suppose F (3)∩E(A 1)≠∅ and F (3)∩E(B 2)≠∅. Without loss of generality, assume that A 3, B 3 contain F (2) and F (1) respectively. Assume that x 2∈A 3. Since , by Lemma 18, . Then, there exists an edge {x 2,x 4} and C is not chordless, a contradiction. Thus, x 2∈B 3. But and thus, by Lemma 18, . Hence, there exists a chord {x 2,x 4} and C is not chordless, again a contradiction.
-
(c)
Suppose F (3)∩E(B 1)≠∅. Renaming vertices x 1, x 2, x 3 and x 4 as x 4, x 3, x 2 and x 1, respectively, brings us back to subcase 2(b).
Thus, GΔ does not contain a chordless cycle of length four or greater; hence, GΔ is chordal.
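Chordality of a small graph such as GΔ can be verified mechanically. The sketch below is not the argument of Lemma 21; it uses the standard fact that a graph is chordal exactly when repeatedly deleting simplicial vertices (vertices whose remaining neighbours form a clique) removes every vertex. The function name and the example graphs are illustrative, and the loop favours clarity over efficiency.

```python
def is_chordal(vertices, edges):
    """Chordality test by greedy elimination of simplicial vertices.
    Edges are 2-element frozensets."""
    adj = {v: set() for v in vertices}
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(vertices)
    while remaining:
        simplicial = None
        for v in remaining:
            nbrs = adj[v] & remaining
            # v is simplicial if its remaining neighbours are pairwise adjacent.
            if all(b in adj[a] for a in nbrs for b in nbrs if a != b):
                simplicial = v
                break
        if simplicial is None:
            return False  # no simplicial vertex: a chordless cycle of length >= 4 exists
        remaining.remove(simplicial)
    return True

def E(a, b):
    return frozenset((a, b))

# Hypothetical example: the 4-cycle, with and without the chord {1, 3}.
V = [1, 2, 3, 4]
cycle = {E(1, 2), E(2, 3), E(3, 4), E(4, 1)}
```

The plain 4-cycle is rejected because no vertex is simplicial, while adding the chord {1,3} makes the graph chordal.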
Proof of Theorem 11.
Lemma 21 states that GΔ is triangulated. We now prove that GΔ is a legal triangulation; i.e., that it satisfies conditions (LT1) and (LT2) of Section ‘Display graphs and edge label intersection graphs’.
Condition (LT2) holds for GΔ, because our construction adds no fill-in edge incident on a leaf. Now suppose that GΔ violates (LT1); i.e., GΔ has a clique H with two internal edges e={x1,y1} and e′={x2,y2}. Let F be the cut that differentiates e. Assume that x1∈V(A F ) and y1∈V(B F ). By Lemma 20, F∪ does not contain both endpoints of e′. Without loss of generality, assume that x2∉F∪ and x2∈A. Since x2∉F∪ and y1∉F∩, by Lemma 19, there is no edge between x2 and y1 in GΔ. Thus, H is not a clique of GΔ, a contradiction. Hence, GΔ satisfies (LT1) and is therefore a legal triangulation of .
Conclusion
We have shown that the characterization of tree compatibility in terms of restricted triangulations of the edge label intersection graph transforms into a characterization in terms of minimal cuts in the display graph. These two characterizations are closely related to the legal triangulation characterization of [8]. We also derived characterizations of the agreement supertree problem in terms of minimal cuts and minimal separators of the display and edge label intersection graphs respectively.
It remains to be seen whether any of our characterizations can lead to explicit fixed-parameter algorithms for the tree compatibility and agreement supertree problems when parametrized by the number of trees. Indeed, as of yet, the fixed-parameter tractability of agreement remains open.
We close with some remarks on characterizations of two problems related to compatibility. A profile defines a tree S if S is the only compatible supertree for . identifies a tree S if S is a compatible supertree for and every other compatible supertree for displays S. Grunewald et al. [16] use quartet graphs to characterize when a profile consisting of quartet trees defines or identifies a tree. An interesting question is whether similar characterizations can be derived for arbitrary profiles using display graphs or edge label intersection graphs. Along these lines, we note a connection between complete sets of cuts and the question of whether a profile defines a tree, which was pointed out by one of the reviewers. To explain it, we need some definitions ([10], p. 131). Let T be a tree and let q=x y|w z be a quartet tree displayed by T. Quartet tree q distinguishes an interior edge e of T if e is the only interior edge such that {x,y} and {w,z} are in different connected components of T−e. Now, let S and T be two trees such that S displays T. An interior edge e of T distinguishes an interior edge f of S if there exists a quartet q such that e and f are both distinguished by q. Suppose is a profile in which there is at least one taxon in common among all input trees. Then, defines a tree S if and only if is compatible and every interior edge of S is distinguished by an interior edge of at least one tree in ([10], p. 133). Now, recall that if is a complete set of cuts of , then, for every tree and every internal edge e of T i , there is some cut in which e is the only edge of T i . Thus, if is compatible, e must be a distinguishing edge for some internal edge of a supertree for . This observation could lead to a cut-based characterization of definability analogous to known triangulation-based characterizations (see [10], p. 79).
References
Gordon AD: Consensus supertrees: the synthesis of rooted trees containing overlapping sets of labelled leaves. J Classif. 1986, 9: 335-348.
Aho A, Sagiv Y, Szymanski T, Ullman J: Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions. SIAM J Comput. 1981, 10 (3): 405-421. 10.1137/0210030.
Ng M, Wormald N: Reconstruction of rooted trees from subtrees. Discrete Appl Math. 1996, 69 (1–2): 19-31.
Steel MA: The complexity of reconstructing trees from qualitative characters and subtrees. J Classif. 1992, 9: 91-116. 10.1007/BF02618470.
Bryant D, Lagergren J: Compatibility of unrooted phylogenetic trees is FPT. Theor Comput Sci. 2006, 351: 296-302. 10.1016/j.tcs.2005.10.033.
Courcelle B: The monadic second-order logic of graphs I, Recognizable sets of finite graphs. Inf Comput. 1990, 85: 12-75. 10.1016/0890-5401(90)90043-H.
Arnborg S, Lagergren J, Seese D: Easy problems for tree-decomposable graphs. J Algorithms. 1991, 12 (2): 308-340. 10.1016/0196-6774(91)90006-K.
Vakati S, Fernández-Baca D: Graph triangulations and the compatibility of unrooted phylogenetic trees. Appl Math Lett. 2011, 24 (5): 719-723. 10.1016/j.aml.2010.12.015.
Gysel R, Stevens K, Gusfield D: Reducing problems in unrooted tree compatibility to restricted triangulations of intersection graphs. Algorithms in Bioinformatics – 12th International Workshop, WABI 2012 Ljubljana, Slovenia, September 10–12, 2012. Proceedings, Volume 7534 of Lecture Notes in Computer Science. Edited by: Raphael BJ, Tang J. 2012, 93-105. Heidelberg: Springer
Semple C, Steel M: Phylogenetics. 2003, Oxford Lecture Series in Mathematics, Oxford: Oxford University Press
Buneman P: The recovery of trees from measures of dissimilarity. Mathematics in the Archaeological and Historical Sciences. 1971, 387-395. Edinburgh: Edinburgh University Press
Parra A, Scheffler P: Characterizations and algorithmic applications of chordal graph embeddings. Discrete Appl Math. 1997, 79 (1–3): 171-188.
Bouchitté V, Todinca I: Treewidth and minimum fill-in: grouping the minimal separators. SIAM J Comput. 2001, 31: 212-232. 10.1137/S0097539799359683.
Heggernes P: Minimal triangulations of graphs: a survey. Discrete Math. 2006, 306 (3): 297-317. 10.1016/j.disc.2005.12.003.
Gusfield D: The multi-state perfect phylogeny problem with missing and removable data: solutions via integer-programming and chordal graph theory. J Comput Biol. 2010, 17 (3): 383-399.
Grunewald S, Humphries PJ, Semple C: Quartet compatibility and the quartet graph. Electron J Comb. 2008, 15: R103.
Acknowledgements
We thank Sylvain Guillemot for his valuable comments. We are also grateful to the reviewers for providing constructive criticism. This work was supported in part by the National Science Foundation under grants CCF-1017189 and DEB-0829674.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
SV stated and proved the main results of the paper and wrote most of the first draft. DFB proposed the research topic to SV, supervised the research, contributed to the first draft, and was in charge of the final draft. Both authors read and approved the final manuscript.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Vakati, S., Fernández-Baca, D. Characterizing compatibility and agreement of unrooted trees via cuts in graphs. Algorithms Mol Biol 9, 13 (2014). https://doi.org/10.1186/1748-7188-9-13
DOI: https://doi.org/10.1186/1748-7188-9-13