Table 2 Long-gap-length model conditions: parameter values and summary statistics

From: Non-parametric and semi-parametric support estimation using SEquential RESampling random walks on biomolecular sequences

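| Model condition | Tree height | Insertion/deletion probability | NHD | Gappiness | True align length | Est align length | SP-FN | SP-FP |
|---|---|---|---|---|---|---|---|---|
| 10.long.A | 0.4 | 0.13 | 0.276 | 0.440 | 1804.8 | 1433.7 | 0.272 | 0.315 |
| 10.long.B | 0.7 | 0.1 | 0.363 | 0.481 | 1926.7 | 1447.8 | 0.381 | 0.426 |
| 10.long.C | 1 | 0.06 | 0.455 | 0.456 | 1853.5 | 1413.3 | 0.510 | 0.537 |
| 10.long.D | 1.6 | 0.031 | 0.542 | 0.432 | 1754.1 | 1403.1 | 0.725 | 0.729 |
| 10.long.E | 4.3 | 0.013 | 0.660 | 0.445 | 1811.0 | 1560.1 | 0.899 | 0.897 |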
Our simulation study included additional 10-taxon model conditions that used the long gap length distribution from the study of Liu et al. [15]. The model parameters were model tree height and insertion/deletion probability, and each model condition corresponds to a distinct set of parameter values. The long-gap-length model conditions are named 10.long.A through 10.long.E in order of generally increasing sequence divergence. The remaining table columns list average summary statistics for each model condition (\(n=20\)). “NHD” is the average normalized Hamming distance of a pair of aligned sequences in the true alignment. “Gappiness” is the proportion of true alignment cells that are indels. “True align length” is the length of the true alignment. “Est align length” is the length of the MAFFT-estimated alignment [9], which was provided as input to the support estimation methods. “SP-FN” is the proportion of homologies that appear in the true alignment but not in the MAFFT-estimated alignment, and “SP-FP” is the proportion that appear in the MAFFT-estimated alignment but not in the true alignment.
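The summary statistics defined above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy alignments are hypothetical, and the NHD normalization used here (mismatches divided by the number of columns where both sequences have a residue) is an assumption, since the normalization is not spelled out here. Homologies are taken to be pairs of residues from different sequences that share an alignment column.

```python
from itertools import combinations

GAP = "-"

def gappiness(alignment):
    """Fraction of alignment cells that are gap characters."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count(GAP) for row in alignment)
    return gaps / total

def normalized_hamming(a, b):
    """Mismatches over columns where both rows have a residue (assumed normalization)."""
    shared = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not shared:
        return 0.0
    return sum(x != y for x, y in shared) / len(shared)

def average_nhd(alignment):
    """Average normalized Hamming distance over all pairs of aligned rows."""
    pairs = list(combinations(alignment, 2))
    return sum(normalized_hamming(a, b) for a, b in pairs) / len(pairs)

def homologies(alignment):
    """Set of aligned residue pairs; a residue is (row index, index in the unaligned sequence)."""
    pairs = set()
    counters = [0] * len(alignment)  # residues seen so far in each row
    for col in range(len(alignment[0])):
        present = []
        for i, row in enumerate(alignment):
            if row[col] != GAP:
                present.append((i, counters[i]))
                counters[i] += 1
        pairs.update(combinations(present, 2))
    return pairs

def sp_errors(true_aln, est_aln):
    """Return (SP-FN, SP-FP) for an estimated alignment against the true alignment."""
    true_pairs = homologies(true_aln)
    est_pairs = homologies(est_aln)
    sp_fn = len(true_pairs - est_pairs) / len(true_pairs)
    sp_fp = len(est_pairs - true_pairs) / len(est_pairs)
    return sp_fn, sp_fp

# Toy alignments of the same three sequences (ACGT, ACGT, ACTG); illustrative only.
true_aln = ["AC-GT", "A-CGT", "ACTG-"]
est_aln = ["A-CGT", "A-CGT", "ACTG-"]

sp_fn, sp_fp = sp_errors(true_aln, est_aln)
print(gappiness(true_aln), average_nhd(true_aln), sp_fn, sp_fp)
```

In the table, each reported value would correspond to such a statistic averaged over the 20 replicate datasets of a model condition.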