
Table 1 Index data structures of the string ACTACGTACGTACG$

From: A representation of a compressed de Bruijn graph for pan-genome analysis that enables search

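| i | \(\mathsf {SA}\) | \(\mathsf {LCP}\) | \(B_{r}\) | \(B_{l}\) | \(\mathsf {LF}\) | \(\Psi \) | \(\mathsf {BWT}\) | \(S_{\mathsf {SA}[i]}\) |
|---|---|---|---|---|---|---|---|---|
| 1 | 15 | −1 | 0 | 0 | 10 | 5 | G | $ |
| 2 | 12 | 0 | 1 | 0 | 13 | 6 | T | ACG$ |
| 3 | 8 | 3 | 0 | 0 | 14 | 7 | T | ACGTACG$ |
| 4 | 4 | 7 | 1 | 0 | 15 | 8 | T | ACGTACGTACG$ |
| 5 | 1 | 2 | 0 | 0 | 1 | 9 | $ | ACTACGTACGTACG$ |
| 6 | 13 | 0 | 0 | 0 | 2 | 10 | A | CG$ |
| 7 | 9 | 2 | 0 | 0 | 3 | 11 | A | CGTACG$ |
| 8 | 5 | 6 | 0 | 0 | 4 | 12 | A | CGTACGTACG$ |
| 9 | 2 | 1 | 0 | 1 | 5 | 15 | A | CTACGTACGTACG$ |
| 10 | 14 | 0 | 0 | 0 | 6 | 1 | C | G$ |
| 11 | 10 | 1 | 0 | 0 | 7 | 13 | C | GTACG$ |
| 12 | 6 | 5 | 0 | 1 | 8 | 14 | C | GTACGTACG$ |
| 13 | 11 | 0 | 0 | 0 | 11 | 2 | G | TACG$ |
| 14 | 7 | 4 | 0 | 0 | 12 | 3 | G | TACGTACG$ |
| 15 | 3 | 8 | 0 | 0 | 9 | 4 | C | TACGTACGTACG$ |
| 16 | | −1 | | | | | | |
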
  1. The suffix array \(\mathsf {SA}\) of the string ACTACGTACGTACG$ and related notions are defined in section “Preliminaries”. The bit vectors \(B_r\) and \(B_l\) for \(k=3\) are explained in section “Computation of right-maximal k-mers and node identifiers”.
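
The columns \(\mathsf {SA}\), \(\mathsf {LCP}\), \(\mathsf {LF}\), \(\Psi \) and \(\mathsf {BWT}\) of Table 1 can be recomputed with a small sketch like the one below. It assumes the standard textbook definitions (1-based positions, \(\mathsf {LCP}[1] = \mathsf {LCP}[n+1] = -1\), cyclic wrap-around for \(\mathsf {LF}\) and \(\Psi \)), which may be worded differently in the "Preliminaries" section. The function name `build_index` is hypothetical, and the naive sorting is for illustration only, not the construction used in the article. The bit vectors \(B_r\) and \(B_l\) are left out because their definition lies in a section not reproduced here.

```python
def build_index(s):
    """Naively compute SA, LCP, BWT, LF and Psi (1-based values) for a
    string s that ends with a unique sentinel '$' smaller than all other
    characters. Illustration only: sorting suffixes this way is quadratic
    or worse."""
    n = len(s)

    # Suffix array: start positions (1-based) of suffixes in sorted order.
    sa = sorted(range(1, n + 1), key=lambda p: s[p - 1:])

    # Inverse suffix array: isa[p] is the rank (1-based) of suffix p.
    isa = {p: rank for rank, p in enumerate(sa, start=1)}

    def lcp_len(a, b):
        # Length of the longest common prefix of suffixes a and b.
        k = 0
        while a + k <= n and b + k <= n and s[a + k - 1] == s[b + k - 1]:
            k += 1
        return k

    # LCP has n + 1 entries; the first and last are the sentinel -1.
    lcp = [-1] + [lcp_len(sa[i - 1], sa[i]) for i in range(1, n)] + [-1]

    # BWT[i] is the character preceding suffix SA[i] (cyclically).
    bwt = "".join(s[p - 2] if p > 1 else s[n - 1] for p in sa)

    # LF[i]: rank of the suffix starting one position to the left.
    lf = [isa[p - 1] if p > 1 else isa[n] for p in sa]

    # Psi[i]: rank of the suffix starting one position to the right.
    psi = [isa[p + 1] if p < n else isa[1] for p in sa]

    return sa, lcp, bwt, lf, psi


if __name__ == "__main__":
    sa, lcp, bwt, lf, psi = build_index("ACTACGTACGTACG$")
    for i in range(len(sa)):
        print(i + 1, sa[i], lcp[i], lf[i], psi[i], bwt[i])
```

Since \(\mathsf {LF}\) and \(\Psi \) are inverse permutations of each other, `psi[lf[i] - 1] == i + 1` holds for every 0-based list index `i`, which gives a quick sanity check on any row of the table.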