A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

Beller, Timo; Ohlebusch, Enno

doi:10.1186/s13015-016-0083-7

Algorithms for Molecular Biology

Table 2 Runtime and maximum main memory usage for the construction of the compressed de Bruijn graph

From: A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

k	Algorithm	40 E. coli	62 E. coli	7 × Chr1	7 × HG
init	SplitMEM	117 (315.25)	141 (317.00)	−	−
init	A1, A2	38 (5.00)	64 (5.00)	380 (5.00)	−
init	A3, A4	131 (1.32)	202 (1.24)	1168 (1.24)	20,341 (1.24)
50	SplitMEM	2261 (572.19)	−	−	−
50	A1	57 (5.22)	92 (5.34)	596 (6.20)	−
50	A2	61 (8.49)	97 (8.78)	619 (9.98)	−
50	A3	188 (2.23)	300 (2.26)	1733 (3.07)	29,816 (2.77)
50	A3compr1	208 (1.81)	346 (1.85)	1880 (2.66)	31,472 (2.36)
50	A3compr2	236 (1.63)	374 (1.66)	2318 (2.51)	39,366 (2.22)
50	A4	164 (1.75)	254 (1.82)	1419 (1.28)	25,574 (1.96)
50	A4compr1	167 (1.46)	257 (1.53)	1435 (1.28)	25,866 (1.66)
50	A4compr2	179 (1.32)	272 (1.24)	1526 (1.24)	27,365 (1.39)
50	A4+explicit	172 (3.26)	268 (3.35)	1515 (3.59)	27,619 (3.88)
50	A4compr1+explicit	176 (2.97)	271 (3.06)	1541 (3.31)	28,044 (3.64)
50	A4compr2+explicit	188 (2.66)	289 (2.74)	1629 (2.96)	29,517 (3.38)
100	SplitMEM	2568 (572.20)	−	−	−
100	A1	59 (5.00)	95 (5.00)	595 (5.95)	−
100	A2	62 (7.89)	99 (8.19)	605 (9.74)	−
100	A3	188 (1.63)	299 (1.68)	1738 (2.74)	27,815 (2.23)
100	A3compr1	205 (1.50)	326 (1.49)	1839 (2.33)	30,401 (1.80)
100	A3compr2	232 (1.32)	411 (1.29)	2340 (2.14)	38,134 (1.66)
100	A4	174 (1.71)	261 (1.79)	1422 (1.28)	25,723 (1.94)
100	A4compr1	171 (1.42)	264 (1.50)	1439 (1.28)	26,040 (1.64)
100	A4compr2	185 (1.32)	289 (1.24)	1544 (1.24)	27,464 (1.37)
100	A4+explicit	178 (2.61)	270 (2.73)	1486 (3.21)	26,878 (3.36)
100	A4compr1+explicit	175 (2.32)	273 (2.44)	1500 (2.92)	26,999 (3.07)
100	A4compr2+explicit	190 (2.01)	299 (2.12)	1624 (2.68)	28,665 (2.80)
500	SplitMEM	2116 (570.84)	−	−	−
500	A1	72 (5.00)	113 (5.00)	620 (5.83)	−
500	A2	83 (7.17)	117 (7.43)	640 (9.66)	−
500	A3	194 (1.50)	304 (1.49)	1752 (2.67)	28,548 (2.07)
500	A3compr1	216 (1.50)	325 (1.49)	1839 (2.19)	30,488 (1.65)
500	A3compr2	241 (1.32)	378 (1.29)	2319 (2.06)	36,993 (1.50)
500	A4	184 (1.65)	283 (1.74)	1453 (1.28)	26,362 (1.93)
500	A4compr1	197 (1.35)	287 (1.44)	1477 (1.28)	26,545 (1.63)
500	A4compr2	213 (1.32)	322 (1.24)	1622 (1.24)	28,501 (1.36)
500	A4+explicit	185 (1.81)	285 (1.90)	1509 (3.14)	27,285 (3.14)
500	A4compr1+explicit	198 (1.52)	288 (1.61)	1535 (2.83)	27,417 (2.79)
500	A4compr2+explicit	214 (1.32)	323 (1.29)	1694 (2.56)	29,283 (2.58)

The first column shows the k-mer size (an entry init means that only the index data structure is constructed) and the second column specifies the algorithm used in the experiment. The remaining columns show the runtimes in seconds and, in parentheses, the maximum main memory usage in bytes per base pair (including the construction) for the data sets described in the text. A minus indicates that the respective algorithm was not able to solve its task on our machine, which was equipped with 128 GB of RAM.
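
For readers who want to process the table programmatically, the cell format described in the caption can be decoded with a short script. The following is a minimal sketch, not part of the article: it assumes the rows are tab-separated as shown here, and the names `Cell`, `parse_cell` and `parse_row` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Cell(NamedTuple):
    seconds: int         # runtime in seconds
    bytes_per_bp: float  # maximum main memory usage in bytes per base pair


# Matches cells such as "179 (1.32)" or "27,365 (1.39)".
_CELL = re.compile(r"^([\d,]+)\s*\(([\d.]+)\)$")


def parse_cell(text: str) -> Optional[Cell]:
    """Decode one table cell; a minus means the run did not complete."""
    text = text.strip()
    if text in {"\u2212", "-"}:  # typographic minus or ASCII hyphen
        return None
    match = _CELL.match(text)
    if match is None:
        raise ValueError(f"unrecognised cell: {text!r}")
    seconds = int(match.group(1).replace(",", ""))  # drop thousands separator
    return Cell(seconds, float(match.group(2)))


def parse_row(line: str) -> Tuple[str, str, List[Optional[Cell]]]:
    """Split a tab-separated row into k, algorithm name, and decoded cells."""
    k, algorithm, *cells = line.rstrip("\n").split("\t")
    return k, algorithm, [parse_cell(c) for c in cells]


# Example: the k = 50, A4compr2 row of the table.
row = "50\tA4compr2\t179 (1.32)\t272 (1.24)\t1526 (1.24)\t27,365 (1.39)"
k, algorithm, cells = parse_row(row)
print(k, algorithm, cells[3])
```

Returning `None` for a minus keeps failed runs distinct from a measured value of zero, so later comparisons between algorithms can skip them explicitly.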

ISSN: 1748-7188
