Figure 3 | Algorithms for Molecular Biology

Figure 3

From: Jaccard index based similarity measure to compare transcription factor binding site models

The similarities (as a function of P-value) and LOGO representations for pairs of TFBS models (HOCOMOCO and JASPAR) for selected TFs. Notably, even for extremely similar LOGOs, such as those of CTCF, the Jaccard similarity reaches only 0.6, meaning that the binding sites recognized by both models make up only 60% of the sites recognized by at least one of them. The similarity remains comparatively low even at high P-values (e.g. 0.01, where one in every 100 words of the dictionary is recognized as a binding site). The same effect is seen for KLF4 (except for a similarity of 1.0 at the lowest P-value, where both models recognize only identical consensus sequences). SPI1 models, which differ in length, show very weak similarity. HIF1A models are surprisingly dissimilar at low P-values (possibly due to their shorter lengths).
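
To make the 0.6 figure concrete, the sketch below computes the Jaccard index for two small sets of recognized words. It is only an illustration: the function name `jaccard` and both word sets are hypothetical and are not taken from the HOCOMOCO or JASPAR models or from the article's software.

```python
def jaccard(a, b):
    """Return the Jaccard index |A & B| / |A | B| of two collections.

    Two empty collections are treated as identical (index 1.0).
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Hypothetical toy data: words each model would recognize at some threshold.
sites_model_a = {"ACGT", "ACGA", "ACGC", "ACGG", "TCGT", "TCGA", "CCGT", "GCGT"}
sites_model_b = {"ACGT", "ACGA", "ACGC", "ACGG", "TCGT", "TCGA", "ACTT", "ACAT"}

shared = sites_model_a & sites_model_b        # 6 words recognized by both
either = sites_model_a | sites_model_b        # 10 words recognized by at least one
similarity = jaccard(sites_model_a, sites_model_b)

print(len(shared), len(either), similarity)
```

In this toy case each set holds 8 words and 6 are shared, so each model shares 75% of its own words with the other while the Jaccard index is 6/10 = 0.6. The index is taken relative to the union, which is why it can look low even when two models agree on most of their sites.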
