Table 6 Experiments on (references of) pangenomes with k = 64 and a min abundance of 1

From: Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time

| Pangenome | Tigs | CL ratio | SC ratio | Time [s] | Memory [MiB] |
|---|---|---|---|---|---|
| 1102x N. gonorrhoeae | Unitigs | 1.805 | 3.026 | 57.3 | 6116 |
| | UST | 1.028 | 1.069 | 59.4 (1.04) | 6116 (1.00) |
| | Eulertigs | 1 | 1 | 59.2 (1.03) | 6116 (1.00) |
| 616x S. pneumoniae | Unitigs | 1.767 | 3.008 | 42.4 | 5375 |
| | UST | 1.026 | 1.068 | 46.9 (1.10) | 5375 (1.00) |
| | Eulertigs | 1 | 1 | 47.0 (1.11) | 5375 (1.00) |
| 3682x E. coli | Unitigs | 1.803 | 3.037 | 637 | 6897 |
| | UST | 1.030 | 1.076 | 720 (1.13) | 6897 (1.00) |
| | Eulertigs | 1 | 1 | 724 (1.14) | 6897 (1.00) |
| \(\sim\)309kx Salmonella | Unitigs | 1.873 | 3.021 | 202386 | 15580 |
| | UST | 1.042 | 1.098 | 202838 (1.00) | 15580 (1.00) |
| | Eulertigs | 1 | 1 | 202816 (1.00) | 15580 (1.00) |

  1. The CL and SC ratios are given relative to the CL-optimal Eulertigs. For time and memory, we report the total time and maximum memory required to compute the tigs from the respective data set. BCALM2 directly computes unitigs, while UST- and Eulertigs require a run of BCALM2 first before they can be computed themselves. ProphAsm is run directly on the source data. The number in parentheses after time and memory indicates the slowdown/increase over computing just unitigs with BCALM2. BCALM2 was run with 28 threads, while all other tools support only one thread. The N. gonorrhoeae pangenome contains 8.36 million unique k-mers, the S. pneumoniae pangenome contains 19.3 million unique k-mers, the E. coli pangenome contains 341 million unique k-mers, the Salmonella pangenome contains 657 million unique k-mers and the human pangenome contains 2.8 billion unique k-mers. Due to its size, ProphAsm could not be run on the Salmonella pangenome. Also due to size, BCALM2 did not run on the human pangenome, hence we used Cuttlefish 2. To still be able to compare against competitors, we ran ProphAsm on the unitigs produced by Cuttlefish 2 (UST requires extra information specific to BCALM2). Cuttlefish 2 supports only odd k, hence the human pangenome is excluded from this experiment. ProphAsm supports only \(k \le 32\), hence it is excluded from this experiment.
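The parenthesized factors described in the footnote can be recomputed from the table as a plain ratio against the unitig (BCALM2) baseline. The following is a minimal sketch, not the authors' tooling: the names `TIMES_S`, `slowdown` and `slowdown_table` are hypothetical, and the inputs are the already-rounded times printed in the table, so a recomputed factor may differ from the printed one in the last digit.

```python
# Total times in seconds, copied from the table (rounded as printed).
TIMES_S = {
    "N. gonorrhoeae": {"Unitigs": 57.3, "UST": 59.4, "Eulertigs": 59.2},
    "S. pneumoniae": {"Unitigs": 42.4, "UST": 46.9, "Eulertigs": 47.0},
    "E. coli": {"Unitigs": 637.0, "UST": 720.0, "Eulertigs": 724.0},
    "Salmonella": {"Unitigs": 202386.0, "UST": 202838.0, "Eulertigs": 202816.0},
}


def slowdown(value, baseline):
    """Return value / baseline, the factor shown in parentheses."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline


def slowdown_table(times, baseline_key="Unitigs"):
    """Compute the slowdown of every non-baseline tig type per pangenome."""
    result = {}
    for pangenome, by_tigs in times.items():
        base = by_tigs[baseline_key]
        result[pangenome] = {
            tigs: slowdown(t, base)
            for tigs, t in by_tigs.items()
            if tigs != baseline_key
        }
    return result


if __name__ == "__main__":
    for pangenome, factors in slowdown_table(TIMES_S).items():
        for tigs, factor in factors.items():
            print(f"{pangenome:15s} {tigs:10s} {factor:.2f}")
```

The same ratio applies to the memory column, where every factor is 1.00 because the peak memory is identical across the three tig types in each row group.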