Fig. 2 | Algorithms for Molecular Biology

Fig. 2

From: Alignment- and reference-free phylogenomics with colored de Bruijn graphs

Toy examples of different mutations, illustrating their effect on a C-DBG. Consider four genomes a, b, c and d and k-mer length \(k=3\). Each vertex of the C-DBG is labeled with both its k-mer and the reverse complement (in arbitrary order), as well as its color set. Due to the small value of k, the C-DBG contains edges corresponding to pairs of overlapping k-mers that are not contained in the given strings. For clarity, these are not drawn. Mutations are highlighted in bold and/or italics.

(a) Single nucleotide variation in genomes \(a=b=\;\)AACGCAA and \(c=d=\;\)AACTCAA. The induced ordered split \(\{a,b\}\) and its inverse \(\{c,d\}\), each of weight \(k=3\), yield a corresponding unordered split \(\left\{ \{a,b\},\{c,d\}\right\}\) of weight \(\sqrt{k\,k}=k=3\).

(b) Insertion/deletion of length \(l=4\) (or longer, indicated by dots) in genomes \(a=b=\;\)AACGG\(\,\cdots\)CACAA and \(c=d=\;\)AACCAA. The induced ordered split \(\{a,b\}\) of weight \(l+k-1 = l+2\) and its inverse \(\{c,d\}\) of constant weight \(k-1 = 2\) yield a corresponding unordered split \(\left\{ \{a,b\},\{c,d\}\right\}\) of weight \(\sqrt{(l+k-1)\,(k-1)}=\sqrt{2(l+2)}\).

(c) Inversion of length \(l=4\) (or longer, indicated by dots) between genomes \(a=b=\;\)AACGG\(\,\cdots\)CACAA and \(c=d=\;\)AACTG\(\,\cdots\)CCCAA. The induced ordered split \(\{a,b\}\) and its inverse \(\{c,d\}\), each of constant weight \(2(k-1) = 4\), yield a corresponding unordered split \(\left\{ \{a,b\},\{c,d\}\right\}\) of constant weight \(\sqrt{2(k-1)\,2(k-1)}=2(k-1)=4\).

(d) Lateral gene transfer of length \(l=4\) (or longer, indicated by dots) from genome \(a=\;\)AGG\(\,\cdots\)CAG to \(b=\;\)AACGG\(\,\cdots\)CACAA but not to \(c=d=\;\)AACCAA. Apart from mutation-independent splits for the boundaries and the trivial split \(\{b\}\) (without its inverse), the split \(\{a,b\}\) of weight \(l-k+1=l-2\) and its inverse \(\{c,d\}\) of constant weight \(k-1=2\) are induced, yielding a corresponding unordered split \(\left\{ \{a,b\},\{c,d\}\right\}\) of weight \(\sqrt{(l-k+1)\,(k-1)}=\sqrt{2(l-2)}\).
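The weights quoted in the caption can be checked with a small script. The sketch below is not the authors' implementation. It assumes that the weight of an ordered split is the number of distinct canonical k-mers whose color set equals that split, and that the unordered weight is the geometric mean of a split and its inverse, as the caption's formulas suggest. The function names are hypothetical, and the dotted segments are instantiated with the minimal length \(l=4\).

```python
from collections import Counter
from math import sqrt

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, kmer.translate(_COMPLEMENT)[::-1])


def ordered_split_weights(genomes, k):
    """Count, for every color set, the distinct canonical k-mers carrying it.

    Assumption: this count is what the caption calls the weight of an
    ordered split.
    """
    colors = {}
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            colors.setdefault(canonical(seq[i:i + k]), set()).add(name)
    return Counter(frozenset(c) for c in colors.values())


def unordered_split_weight(weights, side, names):
    """Geometric mean of the weights of an ordered split and its inverse."""
    part = frozenset(side)
    inverse = frozenset(names) - part
    return sqrt(weights[part] * weights[inverse])


# Toy genomes from the four panels, with l = 4 (no extra bases at the dots).
panels = {
    "snv": {"a": "AACGCAA", "b": "AACGCAA", "c": "AACTCAA", "d": "AACTCAA"},
    "indel": {"a": "AACGGCACAA", "b": "AACGGCACAA", "c": "AACCAA", "d": "AACCAA"},
    "inversion": {"a": "AACGGCACAA", "b": "AACGGCACAA",
                  "c": "AACTGCCCAA", "d": "AACTGCCCAA"},
    "lgt": {"a": "AGGCAG", "b": "AACGGCACAA", "c": "AACCAA", "d": "AACCAA"},
}

for label, genomes in panels.items():
    w = ordered_split_weights(genomes, k=3)
    print(label, w[frozenset("ab")], w[frozenset("cd")],
          round(unordered_split_weight(w, "ab", "abcd"), 3))
```

Under these assumptions, the ordered weights of \(\{a,b\}\) and \(\{c,d\}\) come out as 3 and 3 for the single nucleotide variation, 6 and 2 for the insertion/deletion, 4 and 4 for the inversion, and 2 and 2 for the lateral gene transfer, matching the caption's formulas for \(k=3\) and \(l=4\).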
