Table 2 Comparison of assembly accuracy in the first three scenarios

From: GAML: genome assembly by maximum likelihood

| Assembler | Number of scaffolds | Longest scaffold (kb) | Longest scaffold corr. (kb) | N50 (kb) | Err. | N50 corr. (kb) | LAP |
|---|---|---|---|---|---|---|---|
| Staphylococcus aureus, read sets SA1, SA2 | | | | | | | |
| GAML | 28 | 1,191 | 1,191 | 514 | 0 | 514 | −23.45 |
| Allpaths-LG | 12 | 1,435 | 1,435 | 1,092 | 0 | 1,092 | −25.02 |
| SOAPdenovo | 99 | 518 | 518 | 332 | 0 | 332 | −25.03 |
| Velvet | 45 | 958 | 532 | 762 | 17 | 126 | −25.34 |
| Bambus2 | 17 | 1,426 | 1,426 | 1,084 | 0 | 1,084 | −25.73 |
| MSR-CA | 17 | 2,411 | 1,343 | 2,414 | 3 | 1,022 | −26.26 |
| ABySS | 246 | 125 | 125 | 34 | 1 | 28 | −29.43 |
| Cons. Velvet* | 219 | 95 | 95 | 31 | 0 | 31 | −30.82 |
| SGA | 456 | 286 | 286 | 208 | 1 | 208 | −31.80 |
| Escherichia coli, read sets EC1, EC2 | | | | | | | |
| PacbioToCA | 55 | 1,533 | 1,533 | 957 | 0 | 957 | −33.86 |
| GAML | 29 | 1,283 | 1,283 | 653 | 0 | 653 | −33.91 |
| Cerulean | 21 | 1,991 | 1,991 | 694 | 0 | 694 | −34.18 |
| AHA | 54 | 477 | 477 | 213 | 5 | 194 | −34.52 |
| Cons. Velvet* | 383 | 80 | 80 | 21 | 0 | 21 | −36.02 |
| Escherichia coli, read sets EC1, EC2, EC3 | | | | | | | |
| GAML | 4 | 4,662 | 4,661 | 4,662 | 3 | 4,661 | −60.38 |
| Celera | 19 | 4,635 | 2,085 | 4,635 | 19 | 2,085 | −61.47 |
| Cons. Velvet* | 383 | 80 | 80 | 21 | 0 | 21 | −72.03 |
  1. For all assemblies, N50 values are computed relative to the actual genome size. All misjoins were counted as errors, and error-corrected values of N50 and contig sizes were obtained by breaking each contig at each error [24]. All assemblies except for GAML and conservative Velvet were obtained from [24] in the first experiment, and from [6] in the second experiment.
  2. Italic numbers in each column signify the best result.
  3. * Velvet with conservative settings used to create the assembly graph in our method.
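
Footnote 1 describes two computations: N50 measured against the actual genome size, and an error-corrected N50 obtained by breaking each contig at each error. The following Python sketch illustrates one way these could be computed. It is not the evaluation code used for the table; the function names, the representation of a contig as a `(length, error_positions)` pair, and the toy numbers are all assumptions made for illustration.

```python
def n50(lengths, genome_size):
    """Return the largest length L such that pieces of length >= L
    together cover at least half of genome_size (0 if they never do)."""
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0  # the pieces cover less than half of the genome


def break_at_errors(length, error_positions):
    """Split one contig of the given length at each error position
    (hypothetical 0-based offsets) and return the piece lengths."""
    inner = sorted({p for p in error_positions if 0 < p < length})
    cuts = [0] + inner + [length]
    return [end - start for start, end in zip(cuts, cuts[1:])]


def corrected_n50(contigs, genome_size):
    """Error-corrected N50: break every contig at its errors first,
    then compute N50 over the resulting pieces."""
    pieces = []
    for length, error_positions in contigs:
        pieces.extend(break_at_errors(length, error_positions))
    return n50(pieces, genome_size)


# Toy data (hypothetical, not taken from the table), lengths in kb.
toy_contigs = [(1000, [400]), (300, [])]
toy_genome_size = 1200

raw = n50([length for length, _ in toy_contigs], toy_genome_size)
corrected = corrected_n50(toy_contigs, toy_genome_size)
print(raw, corrected)
```

In this toy case the single misjoin splits the 1000 kb contig into 400 kb and 600 kb pieces, so the corrected N50 drops below the raw N50, which is the same effect visible in the table for assemblies with a nonzero error count.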