Fig. 1 | Algorithms for Molecular Biology

Fig. 1

From: Fast characterization of segmental duplication structure in multiple genome assemblies

a A plane-sweep algorithm for finding putative SDs. b Visual guide for the algorithm. The algorithm sweeps a vertical dashed line through the set of winnowed k-mers in a genome G (x axis). At each k-mer starting at location x, it queries the index \(I_G\) to obtain a sorted list K of that k-mer's occurrences in G (right side of the sweep line). The algorithm then scans K and the list L of putative SDs found thus far simultaneously. At each step, it examines the \(i_L\)-th element of L and the \(i_K\)-th element of K, and decides whether to start a new putative SD [(1) and (1′), green k-mers on the right], extend the current putative SD with the current k-mer [(2), black k-mer on the right], or subsume the current k-mer within the current putative SD [(3), red k-mer]. c A visual representation of valid k-mer matchings in a valid alignment (shown by green lines). The red matching would render the alignment invalid because it is not co-linear with the green matchings.
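The co-linearity condition in panel c can be illustrated with a minimal sketch. It assumes each k-mer matching is represented as a pair of start positions (one in each of the two matched regions); the name `is_colinear` and the coordinates below are hypothetical and chosen only for illustration, not taken from the article.

```python
from typing import List, Tuple

# Assumption: a matching is (position in first region, position in second region).
Match = Tuple[int, int]


def is_colinear(matches: List[Match]) -> bool:
    """Return True if the matchings appear in the same order in both regions."""
    ordered = sorted(matches)  # sort by position in the first region
    # Every consecutive pair must advance strictly in both regions.
    return all(
        a[0] < b[0] and a[1] < b[1]
        for a, b in zip(ordered, ordered[1:])
    )


# Hypothetical coordinates: three "green" matchings and one "red" matching.
green = [(0, 100), (10, 112), (25, 130)]
red = (18, 105)

print(is_colinear(green))          # green matchings alone
print(is_colinear(green + [red]))  # green matchings plus the red one
```

With these made-up coordinates, the green matchings pass the check, while adding the red matching fails it: the red matching lies after (10, 112) in the first region but before it in the second, so its line would cross a green one.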
