Table 1 Speed of genomic search and space usage

From: Bitpacking techniques for indexing genomes: II. Enhanced suffix arrays

 

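| Genome | Query | Reverse search: SA | Reverse search: CSA-Sada | Reverse search: SA-WT | Reverse search: CSA-WT | Forward search: SA | Forward search: ESA | Forward search: ESA-bp | Forward search: ESA-byte | Forward search: ESA-gdi |
|---|---|---|---|---|---|---|---|---|---|---|
| Space used (bytes per genome length) | | | | | | | | | | |
| Fly | | 3.5 | 0.9 | 4.9 | 0.9 | 3.5 | 11.4 | 8.0 | 6.6 | 7.1 |
| Chick | | 3.8 | 1.0 | 5.1 | 0.9 | 3.8 | 11.9 | 8.5 | 6.3 | 6.8 |
| Human | | 4.0 | 1.0 | 5.5 | 0.9 | 4.0 | 12.4 | 9.0 | 7.1 | 7.6 |
| Counting task (microseconds per query) | | | | | | | | | | |
| Fly | 12-mers | 42.9 | 45.6 | 1.6 | 1.7 | 46.8 | 2.3 | 52.6 | 5.2 | 1.7 |
| Fly | 24-mers | 81.5 | 88.8 | 4.7 | 4.8 | 47.6 | 2.5 | 54.0 | 5.3 | 2.1 |
| Fly | 36-mers | 119.1 | 130.0 | 5.9 | 7.0 | 42.5 | 2.8 | 54.6 | 5.6 | 2.2 |
| Chick | 12-mers | 52.4 | 44.5 | 2.0 | 2.1 | 54.7 | 2.8 | 61.4 | 7.3 | 2.3 |
| Chick | 24-mers | 98.0 | 90.3 | 4.5 | 5.3 | 53.5 | 3.6 | 65.7 | 8.7 | 3.0 |
| Chick | 36-mers | 148.1 | 137.4 | 9.0 | 8.2 | 54.5 | 3.8 | 67.1 | 7.9 | 3.3 |
| Human | 12-mers | 59.8 | 49.3 | 2.4 | 2.7 | 58.4 | 3.4 | 75.0 | 10.0 | 3.1 |
| Human | 24-mers | 101.5 | 94.5 | 5.8 | 5.6 | 66.8 | 5.1 | 76.7 | 12.1 | 4.0 |
| Human | 36-mers | 151.0 | 139.3 | 9.3 | 9.4 | 70.2 | 5.2 | 81.1 | 11.9 | 4.0 |
| Locating task (microseconds per query) | | | | | | | | | | |
| Fly | 12-mers | 49.6 | 223.8 | 21.5 | 147.8 | 52.0 | 3.5 | 52.2 | 6.7 | 2.9 |
| Fly | 24-mers | 86.5 | 135.0 | 12.0 | 45.2 | 47.4 | 3.1 | 53.9 | 5.5 | 2.5 |
| Fly | 36-mers | 130.2 | 157.6 | 12.9 | 32.1 | 51.3 | 3.1 | 59.3 | 6.5 | 2.4 |
| Chick | 12-mers | 62.0 | 682.7 | 45.8 | 628.3 | 61.5 | 5.2 | 63.9 | 10.7 | 4.3 |
| Chick | 24-mers | 104.5 | 102.9 | 7.2 | 16.3 | 54.2 | 4.0 | 64.1 | 7.9 | 2.8 |
| Chick | 36-mers | 141.0 | 134.5 | 8.8 | 13.5 | 51.7 | 3.8 | 65.6 | 7.7 | 3.3 |
| Human | 12-mers | 80.1 | 6300.8 | 602.6 | 5798.1 | 99.0 | 30.4 | 94.1 | 37.6 | 28.5 |
| Human | 24-mers | 101.4 | 393.6 | 40.8 | 264.0 | 68.1 | 6.8 | 77.4 | 13.8 | 5.5 |
| Human | 36-mers | 149.7 | 175.8 | 12.6 | 46.1 | 60.0 | 5.1 | 79.3 | 12.7 | 4.3 |
| Locating task (nanoseconds per match result) | | | | | | | | | | |
| Fly | 12-mers | 69.2 | 312.4 | 30.1 | 206.2 | 72.5 | 4.8 | 72.9 | 9.3 | 4.1 |
| Fly | 24-mers | 360.5 | 562.8 | 50.2 | 188.6 | 197.8 | 13.0 | 224.6 | 23.1 | 10.3 |
| Fly | 36-mers | 941.9 | 1140.1 | 93.8 | 232.7 | 371.5 | 22.8 | 429.3 | 47.0 | 17.3 |
| Chick | 12-mers | 36.7 | 404.1 | 27.1 | 371.9 | 36.4 | 3.1 | 37.8 | 6.3 | 2.5 |
| Chick | 24-mers | 2993.0 | 2948.3 | 206.9 | 467.4 | 1551.5 | 115.8 | 1836.9 | 226.7 | 81.5 |
| Chick | 36-mers | 9723.0 | 9274.2 | 606.4 | 928.8 | 3566.7 | 259.6 | 4527.3 | 533.2 | 226.3 |
| Human | 12-mers | 3.3 | 259.9 | 24.9 | 239.1 | 4.1 | 1.3 | 3.9 | 1.6 | 1.2 |
| Human | 24-mers | 71.0 | 275.6 | 28.6 | 184.8 | 47.7 | 4.7 | 54.2 | 9.7 | 3.9 |
| Human | 36-mers | 800.8 | 940.7 | 67.7 | 246.5 | 320.8 | 27.3 | 424.5 | 68.2 | 23.2 |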

  1. Genome sources: Drosophila melanogaster version 5.25.64 (Fly), Gallus gallus gg4 (Chick), and Homo sapiens hg19 (Human)
  2. SA: uncompressed suffix array; CSA-Sada: compressed suffix array using the Sadakane method; SA-WT and CSA-WT: uncompressed and compressed suffix arrays, respectively, using a wavelet tree; ESA: enhanced suffix array; ESA-bp: ESA with balanced parenthesis representation; ESA-byte: ESA with bytecoding; ESA-gdi: ESA with exception guide arrays, discriminating character array, and integrated data structure