Table 5 Support estimation method performance on main model conditions

From: Non-parametric and semi-parametric support estimation using SEquential RESampling random walks on biomolecular sequences

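| Model condition | PR-AUC (%), GUIDANCE1 | PR-AUC (%), SERES + GUIDANCE1 | Pairwise t-test corrected q-value | ROC-AUC (%), GUIDANCE1 | ROC-AUC (%), SERES + GUIDANCE1 | DeLong et al. test corrected q-value |
| --- | --- | --- | --- | --- | --- | --- |
| 10.A | 88.74 | *91.17* | \(5.4 \times 10^{-7}\) | 80.22 | *85.57* | \(<10^{-10}\) |
| 10.B | 82.21 | *86.26* | \(1.5 \times 10^{-6}\) | 84.83 | *88.66* | \(<10^{-10}\) |
| 10.C | 76.23 | *83.49* | \(1.9 \times 10^{-4}\) | 86.98 | *91.23* | \(<10^{-10}\) |
| 10.D | 74.65 | *85.81* | \(1.9 \times 10^{-4}\) | 88.55 | *93.72* | \(<10^{-10}\) |
| 10.E | 42.61 | *59.20* | \(3.1 \times 10^{-4}\) | 82.24 | *87.40* | \(<10^{-10}\) |
| 50.A | 98.22 | *98.92* | \(5.3 \times 10^{-10}\) | 83.09 | *90.64* | \(<10^{-10}\) |
| 50.B | 97.84 | *98.69* | \(2.8 \times 10^{-9}\) | 82.85 | *90.39* | \(<10^{-10}\) |
| 50.C | 95.08 | *96.80* | \(5.6 \times 10^{-8}\) | 85.54 | *90.64* | \(<10^{-10}\) |
| 50.D | 90.79 | *95.75* | \(5.3 \times 10^{-6}\) | 88.89 | *94.56* | \(<10^{-10}\) |
| 50.E | 62.47 | *79.14* | \(8.0 \times 10^{-10}\) | 91.02 | *93.23* | \(<10^{-10}\) |
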
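| Model condition | PR-AUC (%), GUIDANCE2 | PR-AUC (%), SERES + GUIDANCE2 | Pairwise t-test corrected q-value | ROC-AUC (%), GUIDANCE2 | ROC-AUC (%), SERES + GUIDANCE2 | DeLong et al. test corrected q-value |
| --- | --- | --- | --- | --- | --- | --- |
| 10.A | 92.55 | *93.33* | \(7.4 \times 10^{-6}\) | 87.17 | *88.34* | \(<10^{-10}\) |
| 10.B | 88.08 | *89.31* | \(8.4 \times 10^{-4}\) | 89.45 | *90.56* | \(<10^{-10}\) |
| 10.C | 84.28 | *86.86* | \(3.1 \times 10^{-4}\) | 91.36 | *92.88* | \(<10^{-10}\) |
| 10.D | 86.03 | *88.75* | \(1.9 \times 10^{-4}\) | 93.34 | *94.69* | \(<10^{-10}\) |
| 10.E | 51.17 | *62.30* | \(1.3 \times 10^{-3}\) | 86.00 | *88.28* | \(<10^{-10}\) |
| 50.A | 98.98 | *99.14* | \(5.3 \times 10^{-6}\) | 91.17 | *92.50* | \(<10^{-10}\) |
| 50.B | 98.79 | *98.96* | \(1.5 \times 10^{-6}\) | 91.24 | *92.44* | \(<10^{-10}\) |
| 50.C | 96.86 | *97.45* | \(3.2 \times 10^{-7}\) | 90.81 | *92.31* | \(<10^{-10}\) |
| 50.D | 94.04 | *96.23* | \(1.5 \times 10^{-5}\) | 92.67 | *95.09* | \(<10^{-10}\) |
| 50.E | 72.61 | *81.47* | \(1.5 \times 10^{-8}\) | 92.94 | *94.22* | \(<10^{-10}\) |
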
Results are shown for five 10-taxon model conditions (named 10.A through 10.E in order of generally increasing sequence divergence) and five 50-taxon model conditions (similarly named 50.A through 50.E). We evaluated the performance of two state-of-the-art methods for MSA support estimation, GUIDANCE1 [18] and GUIDANCE2 [20], versus re-estimation on SERES and parametrically resampled replicates, using parametric techniques from either GUIDANCE1 or GUIDANCE2 (see the “Methods” section for details). We calculated each method’s precision-recall (PR) and receiver operating characteristic (ROC) curves. Performance is evaluated based on the aggregate area under the curve (AUC) across all replicates for a model condition (\(n=20\)). The top rows compare the AUC of GUIDANCE1 (“GUIDANCE1”) with that of SERES combined with parametric techniques from GUIDANCE1 (“SERES + GUIDANCE1”), and the bottom rows compare the AUC of GUIDANCE2 (“GUIDANCE2”) with that of SERES combined with parametric techniques from GUIDANCE2 (“SERES + GUIDANCE2”); for each model condition and pairwise comparison, the best AUC is shown in italics. Statistical significance of PR-AUC or ROC-AUC differences was assessed using a one-tailed pairwise t-test or the DeLong et al. [5] test, respectively, and multiple test correction was performed using the method of Benjamini and Hochberg [1]. Corrected q-values are reported (\(n=20\)), and all were significant (\(\alpha =0.05\)).
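
The multiple test correction mentioned in the table note can be illustrated with a minimal sketch of the Benjamini and Hochberg adjustment. The note does not say which software was used, so this is only one plausible way to compute corrected q-values. The function name `benjamini_hochberg` and the raw p-values are hypothetical and are not taken from the article.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each q-value is the
    # minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical raw p-values for four comparisons (illustration only).
raw = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(raw)
# A comparison is called significant when its q-value is at most alpha.
alpha = 0.05
significant = [value <= alpha for value in q]
```

Each q-value is compared against \(\alpha = 0.05\), the same threshold the table note uses. The adjustment never reorders the tests, so a smaller raw p-value never receives a larger q-value than a bigger one.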