Table 1 Basic statistics for the datasets used in the experiments, for \(k=31\): number of distinct \(k\)-mers (n), number of distinct weights (\(|\mathcal {D}|\)), largest weight (max), expected weight value (E), and empirical entropy of the weights (\(H_0(W)\))

From: On weighted k-mer dictionaries

| Dataset | n | \(\lvert\mathcal{D}\rvert\) | \(\lceil \log_{2}\lvert\mathcal{D}\rvert \rceil\) | \(\max\) | \(\lceil \log_{2} \max \rceil\) | E | \(H_{0}(W)\) |
|---|---|---|---|---|---|---|---|
| E-Coli | 5,235,781 | 22 | 5 | 27 | 5 | 1.05 | 0.206 |
| S-Enterica-100 | 12,408,741 | 620 | 10 | 7956 | 13 | 38.94 | 4.155 |
| Human-Chr-13 | 90,911,778 | 806 | 10 | 6354 | 13 | 1.08 | 0.160 |
| C-Elegans | 94,006,897 | 398 | 9 | 3478 | 12 | 1.07 | 0.223 |
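
The columns of Table 1 can be reproduced from a plain list of weights, one per distinct \(k\)-mer. The following is a minimal sketch, not the authors' code. It assumes the standard definitions: E is the arithmetic mean of the weights, and \(H_0(W) = \sum_{w \in \mathcal{D}} (f_w/n) \log_2 (n/f_w)\), where \(f_w\) is the number of \(k\)-mers with weight \(w\). The names `weight_stats`, `ceil_log2`, and the toy input are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ceil_log2(x: int) -> int:
    """Return ceil(log2(x)) for an integer x >= 1 using exact integer arithmetic."""
    return (x - 1).bit_length()


def weight_stats(weights):
    """Compute the Table 1 statistics for a list of weights (one per distinct k-mer).

    Assumes the standard definitions of mean and empirical (zeroth-order) entropy.
    """
    n = len(weights)
    counts = Counter(weights)          # f_w for every distinct weight w
    num_distinct = len(counts)         # |D|
    max_weight = max(counts)           # largest weight
    expected = sum(weights) / n        # E
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())  # H_0(W)
    return {
        "n": n,
        "distinct": num_distinct,
        "bits_distinct": ceil_log2(num_distinct),
        "max": max_weight,
        "bits_max": ceil_log2(max_weight),
        "expected": expected,
        "entropy": entropy,
    }


# Toy input (hypothetical): most k-mers have weight 1, as in a skewed distribution.
toy = [1, 1, 1, 1, 1, 1, 2, 3]
stats = weight_stats(toy)
print(stats)
```

A skewed weight distribution, where most \(k\)-mers share the same small weight, gives an entropy well below \(\lceil \log_2 |\mathcal{D}| \rceil\) bits per weight, which is the pattern visible in the E-Coli, Human-Chr-13, and C-Elegans rows.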