Table 3 Space in bytes per input base pair for the explicit and the implicit representation of the compressed de Bruijn graph

From: A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

k    ds           40 E. coli  62 E. coli  7 × Chr1  7 × HG
50   Explicit     1.80        1.89        2.80      2.57
50   Implicit     0.84        0.82        0.77      0.76
50   Implicit-c1  0.55        0.53        0.47      0.47
50   Implicit-c2  0.30        0.27        0.25      0.26
100  Explicit     1.46        1.51        2.55      2.12
100  Implicit     0.80        0.79        0.75      0.74
100  Implicit-c1  0.51        0.50        0.46      0.45
100  Implicit-c2  0.26        0.24        0.23      0.24
500  Explicit     1.07        1.08        2.50      2.01
500  Implicit     0.74        0.74        0.75      0.74
500  Implicit-c1  0.44        0.44        0.45      0.44
500  Implicit-c2  0.20        0.18        0.23      0.23
  1. The numbers for the explicit representation include the input, and the numbers for the implicit representation include the \(\mathsf{BWT}\) stored in a wavelet tree. The suffix -c1 means that the bit vectors \(B_{l}\) and \(B_{r}\) of the implicit representation are compressed; the suffix -c2 means that the bit vectors in the wavelet tree are compressed as well.
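
To read the table, it can help to express each implicit variant as a fraction of the explicit representation at the same k. The Python sketch below does that with the values copied from Table 3. The names TABLE3, DATASETS, ratio_to_explicit, and estimated_bytes are illustrative and do not come from the paper. The size estimate assumes that the per-base-pair figure scales linearly with input length, which the table itself does not state.

```python
# Values copied from Table 3 (bytes per input base pair).
# Keys are (k, representation); list order follows DATASETS.
DATASETS = ["40 E. coli", "62 E. coli", "7 x Chr1", "7 x HG"]

TABLE3 = {
    (50, "Explicit"): [1.80, 1.89, 2.80, 2.57],
    (50, "Implicit"): [0.84, 0.82, 0.77, 0.76],
    (50, "Implicit-c1"): [0.55, 0.53, 0.47, 0.47],
    (50, "Implicit-c2"): [0.30, 0.27, 0.25, 0.26],
    (100, "Explicit"): [1.46, 1.51, 2.55, 2.12],
    (100, "Implicit"): [0.80, 0.79, 0.75, 0.74],
    (100, "Implicit-c1"): [0.51, 0.50, 0.46, 0.45],
    (100, "Implicit-c2"): [0.26, 0.24, 0.23, 0.24],
    (500, "Explicit"): [1.07, 1.08, 2.50, 2.01],
    (500, "Implicit"): [0.74, 0.74, 0.75, 0.74],
    (500, "Implicit-c1"): [0.44, 0.44, 0.45, 0.44],
    (500, "Implicit-c2"): [0.20, 0.18, 0.23, 0.23],
}


def ratio_to_explicit(k, representation):
    """Space of `representation` divided by the explicit space, per data set."""
    explicit = TABLE3[(k, "Explicit")]
    other = TABLE3[(k, representation)]
    return [round(o / e, 2) for o, e in zip(other, explicit)]


def estimated_bytes(k, representation, dataset, input_bp):
    """Rough size estimate; assumes linear scaling in the input length (an assumption)."""
    per_bp = TABLE3[(k, representation)][DATASETS.index(dataset)]
    return per_bp * input_bp


if __name__ == "__main__":
    for k in (50, 100, 500):
        print(k, ratio_to_explicit(k, "Implicit-c2"))
```

Keeping the table as a dictionary keyed by (k, representation) makes it easy to compare any two rows. Note that the explicit and implicit figures are not measured on identical terms: per the footnote, the former includes the input and the latter includes the wavelet tree of the BWT.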