Table 4 Accuracies on large data sets

From: Towards a practical O(n logn) phylogeny algorithm

| Method | 20,000 sequences | 40,000 sequences | 78,132 sequences |
| --- | --- | --- | --- |
| weighted majority, 5 quartets | 80.8 ± 1.1 | 78.9 ± 1.4 | 80.8 ± 1.1 |
| weighted majority, 5 quartets, all taxa forced | 79.9 ± 1.1 | 77.8 ± 1.4 | 75.3 ± 1.6 |
| weighted majority, 5 quartets + local search | 96.1 | 94.4 | 92.8 |
| weighted majority, 5 quartets, all taxa forced + local search | 95.8 | 93.8 | 92.0 |
| FastTree (NJ phase only) | 62.9 | 58.1 | 52.2 |
| FastTree + local search | 95.8 | 93.8 | 92.0 |
  1. Robinson-Foulds accuracies of FastTree and the random walk algorithm on the huge.1 data set [9]. The figures for the random walk algorithm are averages over 10 runs of the algorithm, given together with empirical standard deviations. We used a confidence threshold, with two additional rounds of insertions. The average taxon coverage for weighted majority was 98.6, 98.6, and 98.0 per cent for the 20,000-, 40,000-, and 78,132-taxon alignments, respectively. After applying local search, the variance between runs of the random walk algorithm is negligible, so no standard deviations are given for those rows.
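
The footnote reports accuracies averaged over 10 runs together with empirical standard deviations. The following is a minimal sketch of how such a figure could be computed, not the authors' code. It assumes that accuracy means the percentage of reference-tree splits recovered by the inferred tree (the table itself does not spell out the definition) and that the sample standard deviation is used. The function names `rf_accuracy` and `summarize_runs` and the toy splits are hypothetical.

```python
from statistics import mean, stdev


def rf_accuracy(reference_splits, inferred_splits):
    """Percentage of reference splits that also occur in the inferred tree.

    Assumption: each split is a frozenset holding the taxa on one side of an
    internal edge, normalised the same way in both trees (here: the side
    containing taxon "A").
    """
    reference = set(reference_splits)
    if not reference:
        raise ValueError("reference tree has no internal splits")
    shared = reference & set(inferred_splits)
    return 100.0 * len(shared) / len(reference)


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) over repeated runs."""
    return mean(accuracies), stdev(accuracies)


# Hypothetical toy example on six taxa A..F; not data from the paper.
reference = {frozenset("AB"), frozenset("ABC"), frozenset("ABCD")}
inferred = {frozenset("AB"), frozenset("ABD"), frozenset("ABCD")}

accuracy = rf_accuracy(reference, inferred)  # 2 of 3 splits recovered
avg, sd = summarize_runs([80.0, 82.0, 81.0])  # three hypothetical runs
print(f"{accuracy:.1f}  {avg:.1f} ± {sd:.1f}")
```

Representing each tree as a set of splits keeps the comparison to a set intersection. When the inferred tree covers only part of the taxa, as with the coverage figures quoted above, the reference splits would first have to be restricted to the covered taxa, which this sketch does not do.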