Table 5 P-values for several DNA patterns (known transcription factors are marked with a star) in the upstream region data set.

From: Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

DNA pattern                 n     L     P-value (homogeneous)   P-value (heterogeneous)
CGCACCC*                    28    10    2.95 × 10⁻³             3.74 × 10⁻³
AAGAAAAA*                   427   11    1.31 × 10⁻⁹⁹            1.29 × 10⁻⁹⁹
AACAACAAC                   25    10    1.76 × 10⁻⁶             1.38 × 10⁻⁶
TCCGTGGA*                   22    11    1.12 × 10⁻⁶             1.55 × 10⁻⁶
GCGCGCGC                    18    11    6.52 × 10⁻¹⁰            1.65 × 10⁻⁹
RTAAAYAA*                   391   14    7.70 × 10⁻¹²            1.68 × 10⁻¹²
WWWTTTGCTCR*                15    17    4.15 × 10⁻¹             4.09 × 10⁻¹
AAAAAAAAAAAAAAAAAAAAAAAA    42    27    2.05 × 10⁻²³            2.14 × 10⁻²²
TAWWWWTAGM*                 212   36    3.08 × 10⁻⁹             3.04 × 10⁻⁹
YCCNYTNRRCCGN*              11    40    3.10 × 10⁻²             3.05 × 10⁻²
GCGCNNNNNNGCGC              1     106   8.97 × 10⁻¹             8.84 × 10⁻¹
CGGNNNNNNNNCGG*             102   183   1.26 × 10⁻¹⁴            1.73 × 10⁻¹³
GCGCNNNNNNNNNNGCGC          6     464   2.88 × 10⁻²             2.84 × 10⁻²