
Table 4 Experiments on (references of) pangenomes with k = 32 and a min abundance of 1

From: Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time

| Pangenome | Tigs | CL ratio | SC ratio | Time [s] | Memory [MiB] |
|---|---|---|---|---|---|
| 1102x N. gonorrhoeae | Unitigs | 1.623 | 3.053 | 37.1 | 6725 |
| | UST | 1.023 | 1.074 | 39.3 (1.06) | 6725 (1.00) |
| | ProphAsm | 1.00005 | 1.00015 | 764 (20.6) | 210 (0.03) |
| | Eulertigs | 1 | 1 | 38.3 (1.03) | 6725 (1.00) |
| 616x S. pneumoniae | Unitigs | 1.685 | 3.050 | 37.8 | 4036 |
| | UST | 1.026 | 1.079 | 42.2 (1.12) | 4036 (1.00) |
| | ProphAsm | 1.00005 | 1.00014 | 446 (11.8) | 439 (0.11) |
| | Eulertigs | 1 | 1 | 41.3 (1.09) | 4036 (1.00) |
| 3682x E. coli | Unitigs | 1.710 | 3.089 | 457 | 7193 |
| | UST | 1.031 | 1.092 | 542 (1.18) | 7193 (1.00) |
| | ProphAsm | 1.00006 | 1.00018 | 7148 (15.6) | 7318 (1.02) |
| | Eulertigs | 1 | 1 | 521 (1.14) | 7193 (1.00) |
| ~309kx Salmonella | Unitigs | 1.831 | 3.141 | 169935 | 13860 |
| | UST | 1.048 | 1.124 | 170358 (1.00) | 13860 (1.00) |
| | Eulertigs | 1 | 1 | 170248 (1.00) | 13860 (1.00) |

  1. The CL and SC ratios are relative to the CL-optimal Eulertigs. For time and memory, we report the total time and maximum memory required to compute the tigs from the respective data set. BCALM2 computes unitigs directly, while UST-tigs and Eulertigs require a run of BCALM2 before they can be computed. ProphAsm is run directly on the source data. The number in parentheses after time and memory is the slowdown or increase relative to computing only unitigs with BCALM2. BCALM2 was run with 28 threads, while all other tools support only one thread. The N. gonorrhoeae pangenome contains 8.36 million unique k-mers, the S. pneumoniae pangenome 19.3 million, the E. coli pangenome 341 million, the Salmonella pangenome 657 million, and the human pangenome 2.8 billion. Due to the size of the Salmonella pangenome, ProphAsm could not be run on it. Also due to size, BCALM2 did not run on the human pangenome, hence we used Cuttlefish 2. To still be able to compare against competitors, we ran ProphAsm on the unitigs produced by Cuttlefish 2 (UST requires extra information specific to BCALM2). Cuttlefish 2 supports only odd k, hence the human pangenome is excluded from this experiment.
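
As an illustration of how the parenthesized factors can be read, the following minimal sketch recomputes them for the N. gonorrhoeae rows by dividing each tool's time and memory by the unitig (BCALM2) baseline. The function name `relative_to_unitigs` and the dictionary layout are hypothetical and chosen only for this example. The inputs are the rounded values printed in the table, so a recomputed factor may differ in the last digit from the published one, which was presumably derived from unrounded measurements.

```python
def relative_to_unitigs(value: float, unitig_baseline: float) -> float:
    """Return value / unitig_baseline, i.e. the factor shown in parentheses."""
    if unitig_baseline <= 0:
        raise ValueError("unitig baseline must be positive")
    return value / unitig_baseline


# Rounded values copied from the N. gonorrhoeae rows of the table.
UNITIG_TIME_S = 37.1
UNITIG_MEMORY_MIB = 6725

# Hypothetical layout: tool name -> (time in seconds, memory in MiB).
rows = {
    "UST": (39.3, 6725),
    "ProphAsm": (764, 210),
    "Eulertigs": (38.3, 6725),
}

for tigs, (time_s, memory_mib) in rows.items():
    slowdown = relative_to_unitigs(time_s, UNITIG_TIME_S)
    memory_factor = relative_to_unitigs(memory_mib, UNITIG_MEMORY_MIB)
    print(f"{tigs}: time x{slowdown:.2f}, memory x{memory_factor:.2f}")
```

A factor above 1 means the tool needed more time or memory than computing unitigs alone; a factor below 1, as for ProphAsm's memory on this data set, means it needed less.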