Table 3 Space in bytes per input base pair for the explicit and the implicit representation of the compressed de Bruijn graph

From: A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

| k | ds | 40 E. coli | 62 E. coli | 7 × Chr1 | 7 × HG |
|---|---|---|---|---|---|
| 50 | Explicit | 1.80 | 1.89 | 2.80 | 2.57 |
| 50 | Implicit | 0.84 | 0.82 | 0.77 | 0.76 |
| 50 | Implicit-c1 | 0.55 | 0.53 | 0.47 | 0.47 |
| 50 | Implicit-c2 | 0.30 | 0.27 | 0.25 | 0.26 |
| 100 | Explicit | 1.46 | 1.51 | 2.55 | 2.12 |
| 100 | Implicit | 0.80 | 0.79 | 0.75 | 0.74 |
| 100 | Implicit-c1 | 0.51 | 0.50 | 0.46 | 0.45 |
| 100 | Implicit-c2 | 0.26 | 0.24 | 0.23 | 0.24 |
| 500 | Explicit | 1.07 | 1.08 | 2.50 | 2.01 |
| 500 | Implicit | 0.74 | 0.74 | 0.75 | 0.74 |
| 500 | Implicit-c1 | 0.44 | 0.44 | 0.45 | 0.44 |
| 500 | Implicit-c2 | 0.20 | 0.18 | 0.23 | 0.23 |

  1. The numbers for the explicit representation include the input, and the numbers for the implicit representation include the \(\mathsf{BWT}\) stored in a wavelet tree. The suffix -c1 means that the bit vectors \(B_{l}\) and \(B_{r}\) of the implicit representation are compressed; the suffix -c2 means that, additionally, the (bit vectors in the) wavelet tree are compressed.
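
The space figures in Table 3 can be compared directly by dividing one representation's value by another's. The following is a minimal sketch of that arithmetic, not code from the paper: the names `DATASETS`, `TABLE3`, and `space_ratio` are illustrative assumptions, the values are copied from the table, and the dataset labels use an ASCII "x" in place of "×".

```python
# Bytes per input base pair, copied from Table 3.
# Column order matches the table: 40 E. coli, 62 E. coli, 7 x Chr1, 7 x HG.
DATASETS = ["40 E. coli", "62 E. coli", "7 x Chr1", "7 x HG"]

TABLE3 = {
    50: {
        "Explicit": [1.80, 1.89, 2.80, 2.57],
        "Implicit": [0.84, 0.82, 0.77, 0.76],
        "Implicit-c1": [0.55, 0.53, 0.47, 0.47],
        "Implicit-c2": [0.30, 0.27, 0.25, 0.26],
    },
    100: {
        "Explicit": [1.46, 1.51, 2.55, 2.12],
        "Implicit": [0.80, 0.79, 0.75, 0.74],
        "Implicit-c1": [0.51, 0.50, 0.46, 0.45],
        "Implicit-c2": [0.26, 0.24, 0.23, 0.24],
    },
    500: {
        "Explicit": [1.07, 1.08, 2.50, 2.01],
        "Implicit": [0.74, 0.74, 0.75, 0.74],
        "Implicit-c1": [0.44, 0.44, 0.45, 0.44],
        "Implicit-c2": [0.20, 0.18, 0.23, 0.23],
    },
}


def space_ratio(k, dataset, baseline="Explicit", variant="Implicit-c2"):
    """Return how many times more space `baseline` uses than `variant`."""
    col = DATASETS.index(dataset)
    row = TABLE3[k]
    return row[baseline][col] / row[variant][col]


if __name__ == "__main__":
    # Print the Explicit / Implicit-c2 ratio for every k and dataset.
    for k in sorted(TABLE3):
        ratios = ", ".join(
            f"{name}: {space_ratio(k, name):.1f}x" for name in DATASETS
        )
        print(f"k={k}: {ratios}")
```

Passing other row labels as `baseline` and `variant` (for example `"Implicit"` and `"Implicit-c1"`) gives the saving attributable to one compression step alone. Note that the explicit figures include the input while the implicit figures include the BWT, so the ratios compare total footprints as reported rather than the index structures in isolation.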