Fig. 4 | Algorithms for Molecular Biology

From: Estimation of genetic diversity in viral populations from next generation sequencing data with extremely deep coverage

Histogram of the complementary probabilities of the control data. The complementary probability per site is defined as p_comp = 1 − max{p_A, p_T, p_C, p_G}, and it depends only on the nucleotide probability distribution at that site. The horizontal axis shows the complementary probability and the vertical axis the proportion of sites. The histogram contains the sites with p_comp < 0.02, which comprise 98.5 % of the genome. These are the sites that have a unique dominant nucleotide with probability greater than or equal to 0.98. The remaining 1.5 % of sites are the ones displaying some variability in the distribution of nucleotides.
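
As an illustration of the definition in the caption, the following is a minimal sketch of how the complementary probability could be computed from per-site nucleotide counts. The function names, the dictionary input format, and the example counts are assumptions made for this sketch; they are not taken from the article, and estimating probabilities as simple count frequencies is a simplification.

```python
def complementary_probability(counts):
    """Return p_comp = 1 - max{p_A, p_T, p_C, p_G} for one site.

    `counts` maps each nucleotide to its observed count at the site
    (hypothetical input format; probabilities are taken to be the
    relative frequencies of the counts).
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("site has no coverage")
    return 1.0 - max(counts.values()) / total


def fraction_below(sites, threshold=0.02):
    """Return the proportion of sites with p_comp strictly below `threshold`."""
    p_comps = [complementary_probability(site) for site in sites]
    return sum(p < threshold for p in p_comps) / len(p_comps)


# Made-up counts: one nearly invariant site and one variable site.
example_sites = [
    {"A": 990, "T": 5, "C": 3, "G": 2},     # dominant nucleotide A, p_comp about 0.01
    {"A": 500, "T": 480, "C": 10, "G": 10}, # no clear dominant nucleotide, p_comp = 0.5
]
print(fraction_below(example_sites))
```

With the default threshold of 0.02, the first example site would fall inside the histogram range and the second would belong to the variable remainder.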

