From: Generalized enhanced suffix array construction in external memory
Dataset | Size (GB) | Number of strings | Total length | Avg. length | Max. lcp | Avg. lcp |
---|---|---|---|---|---|---|
dna | 9.85 | 153 | 10,580,043,054 | 69,150,608 | 2,282,187 | 1,122 |
protein | 18.68 | 62,148,086 | 20,056,474,339 | 323 | 31,815 | 88 |
gutenberg | 22.32 | 407,864,056 | 23,962,356,903 | 59 | 11,946 | 18 |
enwiki | 24.50 | 351,363,467 | 25,648,226,940 | 75 | 111,273 | 33 |
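
To make the column definitions concrete, the following minimal sketch computes the same statistics for a toy collection held in memory. It rests on assumptions not stated in the table: that each string is followed by its own terminator, which is smaller than every ordinary symbol and ordered by string index; that the total length counts those terminators; and that the lcp statistics are taken over the lcp values of adjacent suffixes in the sorted order of all suffixes of the collection. The function name `collection_stats` is hypothetical, and the naive sorting used here only illustrates the definitions. It is not the external-memory construction from the article, and it expects a non-empty collection.

```python
from itertools import takewhile


def collection_stats(strings):
    """Naive, in-memory statistics for a small non-empty collection of strings."""
    suffixes = []
    for i, s in enumerate(strings):
        # Ordinary symbols become (1, code); the terminator of string i becomes
        # (0, i), so it sorts before every symbol and differs between strings.
        key = tuple((1, ord(c)) for c in s) + ((0, i),)
        suffixes.extend(key[j:] for j in range(len(key)))

    # Lexicographic order of all suffixes of the collection (assumed convention).
    suffixes.sort()

    # lcp[k] = length of the common prefix of the k-th suffix and its predecessor.
    lcp = [0]
    for a, b in zip(suffixes, suffixes[1:]):
        common = takewhile(lambda pair: pair[0] == pair[1], zip(a, b))
        lcp.append(sum(1 for _ in common))

    total_length = len(suffixes)  # one suffix per position, terminators included
    return {
        "num_strings": len(strings),
        "total_length": total_length,
        "avg_length": total_length / len(strings),
        "max_lcp": max(lcp),
        "avg_lcp": sum(lcp) / len(lcp),
    }


if __name__ == "__main__":
    print(collection_stats(["ab", "ab"]))
```

For the toy input `["ab", "ab"]` the sorted suffixes are the two terminators followed by the two copies of `ab` and the two copies of `b`, so the lcp values are 0, 0, 0, 2, 0, 1. Because distinct terminators never match, an lcp value cannot extend past the end of a string under these assumptions.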