Towards a practical O(n log n) phylogeny algorithm

Table 2 Comparison of different algorithms

					data set
	COG1011			COG840			COG1028
	250 sequences			1250 sequences			5000 sequences
method	% taxa	RF	QA	% taxa	RF	QA	% taxa	RF	QA
weighted majority, 5 quartets	88.5	66.2	72.8	84.1	57.4	85.8	74.4	51.4	59.5
WTA-vote, 5 quartets	86.0	69.4	70.8	80.2	62.3	85.4	70.1	57.0	59.6
weighted majority, 20 quartets	95.6	60.4	69.9	96.4	50.6	83.9	94.3	41.1	56.5
WTA-vote, 20 quartets	94.0	69.4	73.4	92.1	60.8	83.1	89.7	57.6	55.3
NJ	100	73.6	70.0	100	62.6	88.0	100	73.0	66.3
FastTree (NJ phase only)	100	69.7	85.9	100	61.0	86.6	100	73.6	66.4
weighted majority, 5 quartets, force all taxa	100	59.0	69.7	100	48.7	80.8	100	37.3	52.4

Performance of the random walk algorithm on synthetic alignments. Columns give the percentage of taxa placed in the tree (% taxa), the RF accuracy (RF), and the quartet accuracy (QA), all in percent. The size of the guide tree was 200, except for the 250-taxon data set, where it was set to 100. The average RF accuracy of the guide trees was 65, 50, and 46% for the 250-, 1250-, and 5000-taxon data sets, respectively; the corresponding average quartet accuracies were 73, 83, and 55%. When all taxa are forced into the random walk tree (see text), the RF accuracy decreases by 7 to 14 percentage points, depending on the data set. All random walk runs use the confidence threshold heuristic and two additional rounds of insertions.
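The quoted drop in RF accuracy can be checked against the table with a short calculation. The sketch below assumes the relevant baseline is the "weighted majority, 5 quartets" row, since it differs from the forced row only in the forcing step; the variable names are illustrative and not taken from the paper.

```python
# RF accuracy (%) copied from Table 2, in data set order COG1011, COG840, COG1028.
# Assumption: the forced run is compared against the unforced
# "weighted majority, 5 quartets" row.
rf_unforced = [66.2, 57.4, 51.4]  # weighted majority, 5 quartets
rf_forced = [59.0, 48.7, 37.3]    # weighted majority, 5 quartets, force all taxa

# Drop in RF accuracy per data set, in percentage points.
drops = [round(u - f, 1) for u, f in zip(rf_unforced, rf_forced)]
print(drops)
```

Under this assumption the differences are 7.2, 8.7, and 14.1 percentage points, which matches the range stated in the caption.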

ISSN: 1748-7188