Fig. 5 | Algorithms for Molecular Biology

From: Estimation of genetic diversity in viral populations from next generation sequencing data with extremely deep coverage

Variational distance (vd) between the control data and an experimental condition along the genome. The variational distance per site is defined by \(vd = \left| p_{A} - p_{A}^{\prime} \right| + \left| p_{T} - p_{T}^{\prime} \right| + \left| p_{C} - p_{C}^{\prime} \right| + \left| p_{G} - p_{G}^{\prime} \right|\), where \(\left( p_{A}, p_{T}, p_{C}, p_{G} \right)\) is the probability distribution over nucleotides at a site in the control data and \(\left( p_{A}^{\prime}, p_{T}^{\prime}, p_{C}^{\prime}, p_{G}^{\prime} \right)\) is the probability distribution at the corresponding site in the experimental condition. The horizontal axis shows the sites of the genome (with the LTR regions removed), and the vertical axis shows the corresponding variational distances. Applying the conservative cut-off value of 0.04 to vd yields the sites with significant variation.
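
The per-site definition above can be illustrated with a minimal sketch. It assumes each site is given as a mapping from nucleotide to its estimated frequency; the function names, the toy frequencies, and the choice of a strict "greater than" comparison against the 0.04 cut-off are illustrative assumptions, not taken from the article.

```python
from typing import Dict, List, Sequence

BASES = ("A", "T", "C", "G")


def variational_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Sum of absolute differences between two per-site nucleotide distributions."""
    return sum(abs(p[b] - q[b]) for b in BASES)


def significant_sites(
    control: Sequence[Dict[str, float]],
    experiment: Sequence[Dict[str, float]],
    cutoff: float = 0.04,
) -> List[int]:
    """Return 0-based indices of sites whose vd exceeds the cut-off.

    Assumption: a site is called significant when vd > cutoff (strict).
    """
    if len(control) != len(experiment):
        raise ValueError("control and experiment must cover the same sites")
    return [
        i
        for i, (p, q) in enumerate(zip(control, experiment))
        if variational_distance(p, q) > cutoff
    ]


# Toy data (hypothetical frequencies, for illustration only).
control = [
    {"A": 0.97, "T": 0.01, "C": 0.01, "G": 0.01},
    {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25},
]
experiment = [
    {"A": 0.90, "T": 0.08, "C": 0.01, "G": 0.01},
    {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25},
]

flagged = significant_sites(control, experiment)
```

In this toy input the first site shifts 0.07 of its mass from A to T, giving a vd of about 0.14, while the second site is unchanged; only the first site would be flagged under the 0.04 cut-off.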
