Table 1 Dataset characteristics

From: Disk compression of k-mer sets

| Dataset | Source | Read length (bp) | # Reads | # Distinct 31-mers | # Unitigs | Dead-end unitigs (%) | Isolated unitigs (%) |
|---|---|---|---|---|---|---|---|
| R. sphaeroides | GAGE [37] | 101 | 2,050,868 | 5,908,467 | 442,681 | 47 | 8 |
| Human RNA-seq | SRR957915 | 101 | 49,459,840 | 101,017,526 | 7,665,682 | 4 | 13 |
| Gingiva metagenome | SRS014473 | 101 | 55,419,548 | 101,872,420 | 5,678,516 | 36 | 15 |
| Soybean RNA-seq | SRR11458718 | 125 | 83,594,116 | 111,206,789 | 3,659,969 | 28 | 12 |
| Tongue metagenome | SRS011086 | 101 | 81,664,789 | 165,159,726 | 11,358,233 | 37 | 11 |
| Whole human | ERR174310 | 101 | 207,579,467 | 2,319,022,432 | 51,094,913 | 14 | 18 |