Table 2 Data sets (Additional file 1)

From: Gerbil: a fast and memory-efficient k-mer counter with GPU-support

| Data set | Format | Size (GB) | Avg. read length | # 28-mers | # Distinct 28-mers | Ratio (%) |
|---|---|---|---|---|---|---|
| F. vesca | FASTQ | 10.2 | 352.1 | 4,134,078,256 | 632,436,468 | 15 |
| M. balbisiana | FASTQ | 98.6 | 100.0 | 20,531,572,597 | 965,691,662 | 4 |
| G. gallus | FASTQ | 115.9 | 100.0 | 25,337,974,831 | 2,727,529,829 | 11 |
| H. sapiens | FASTQ | 223.3 | 100.0 | 62,739,461,708 | 6,336,805,684 | 10 |
| H. sapiens 2 | FASTQ | 339.5 | 100.0 | 98,892,620,173 | 6,634,382,141 | 7 |
| GRCh38 | FASTA | 100.0 | 1000.0 | 97,300,000,000 | 1,802,953,276 | 2 |
| N. crassa | FASTA | 23.3 | 7778.3 | 22,808,741,626 | 21,769,513,655 | 95 |
| A. thaliana | FASTQ | 72.7 | 4804.6 | 35,905,278,785 | 32,894,281,429 | 92 |

  1. The rightmost column, 'Ratio', gives the number of distinct 28-mers as a percentage of the total number of 28-mers
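
As an illustration of the footnote, the ratio can be recomputed from the two count columns. The following is a minimal sketch, not part of the Gerbil tool: the function name `distinct_ratio_percent` is hypothetical, and rounding to the nearest whole percent is an assumption about how the column was produced.

```python
def distinct_ratio_percent(distinct_kmers: int, total_kmers: int) -> float:
    """Return distinct k-mers as a percentage of all k-mers (hypothetical helper)."""
    if total_kmers <= 0:
        raise ValueError("total_kmers must be positive")
    return 100.0 * distinct_kmers / total_kmers


# Counts taken from the F. vesca row of the table.
f_vesca = distinct_ratio_percent(632_436_468, 4_134_078_256)

# Rounding to a whole percent is an assumption about the table's convention.
print(round(f_vesca))  # 15
```

A low ratio means that most 28-mers occur many times, whereas a ratio close to 100 (as for N. crassa and A. thaliana) means that almost every 28-mer in the input is unique.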