Fig. 3 | Algorithms for Molecular Biology

Fig. 3

From: Fast characterization of segmental duplication structure in multiple genome assemblies

k-mer chaining-based SD decomposition applied to the example from Fig. 2. Top: after data pre-processing, we end up with three sequences (chrA, chrB_1, and chrB_2), which are scanned from left to right to find identical regions that share common k-mers. The first matching region is the green region in chrA, which matches the same-colored region in chrB_1. Middle: after encountering the yellow region (b), the algorithm marks a new elementary SD because the number of yellow regions does not match the number of green regions; the green regions will therefore be reported as instances of a separate elementary SD. Bottom: if no k-mer can be appended to any of the elementary SDs in L, the algorithm reports all regions that are larger than \(\mu\) as one elementary SD and discards the others. Here, the regions numbered 2, 3, and 5 do not continue into the blue regions and thus prevent further extension of the pink region.
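
The left-to-right scan for regions that share k-mers, followed by the \(\mu\) length filter, can be illustrated with a much simplified sketch. This is not the authors' decomposition algorithm: it does not track elementary SDs or compare region counts, and the function name `shared_kmer_regions`, its parameters, and the toy sequences are assumptions made only for illustration.

```python
from collections import defaultdict


def shared_kmer_regions(seqs, k, mu):
    """Hypothetical helper: for each sequence, return the maximal half-open
    intervals covered by k-mers that occur at two or more positions overall,
    keeping only intervals of length >= mu."""
    # Index every k-mer by the (sequence name, start) positions where it occurs.
    index = defaultdict(list)
    for name, seq in seqs.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))

    regions = {}
    for name, seq in seqs.items():
        runs = []
        start = end = None
        # Scan left to right, extending the current run while shared
        # k-mers overlap or abut it.
        for i in range(len(seq) - k + 1):
            if len(index[seq[i:i + k]]) < 2:
                continue  # this k-mer is unique, so it cannot extend a run
            if start is not None and i <= end:
                end = i + k
            else:
                if start is not None:
                    runs.append((start, end))
                start, end = i, i + k
        if start is not None:
            runs.append((start, end))
        # Discard regions shorter than the minimum length mu.
        regions[name] = [(s, e) for s, e in runs if e - s >= mu]
    return regions


# Toy input (assumed, not from the paper): both sequences contain "ACGTTCAG".
toy = {"chrA": "GGGGACGTTCAGCCCC", "chrB_1": "TTACGTTCAGTT"}
print(shared_kmer_regions(toy, k=4, mu=6))
# chrA -> [(4, 12)], chrB_1 -> [(2, 10)]
```

Raising `mu` above the length of the shared region (for example `mu=9` here) discards it, which mirrors the caption's final step of dropping regions that are not larger than \(\mu\).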