Table 5. P-values for several DNA patterns (known transcription factors are marked with a star) in the upstream region data set.

From: Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data

| DNA pattern | n | L | homogeneous | heterogeneous |
|---|---|---|---|---|
| CGCACCC* | 28 | 10 | 2.95 × 10⁻³ | 3.74 × 10⁻³ |
| AAGAAAAA* | 427 | 11 | 1.31 × 10⁻⁹⁹ | 1.29 × 10⁻⁹⁹ |
| AACAACAAC | 25 | 10 | 1.76 × 10⁻⁶ | 1.38 × 10⁻⁶ |
| TCCGTGGA* | 22 | 11 | 1.12 × 10⁻⁶ | 1.55 × 10⁻⁶ |
| GCGCGCGC | 18 | 11 | 6.52 × 10⁻¹⁰ | 1.65 × 10⁻⁹ |
| RTAAAYAA* | 391 | 14 | 7.70 × 10⁻¹² | 1.68 × 10⁻¹² |
| WWWTTTGCTCR* | 15 | 17 | 4.15 × 10⁻¹ | 4.09 × 10⁻¹ |
| AAAAAAAAAAAAAAAAAAAAAAAA | 42 | 27 | 2.05 × 10⁻²³ | 2.14 × 10⁻²² |
| TAWWWWTAGM* | 212 | 36 | 3.08 × 10⁻⁹ | 3.04 × 10⁻⁹ |
| YCCNYTNRRCCGN* | 11 | 40 | 3.10 × 10⁻² | 3.05 × 10⁻² |
| GCGCNNNNNNGCGC | 1 | 106 | 8.97 × 10⁻¹ | 8.84 × 10⁻¹ |
| CGGNNNNNNNNCGG* | 102 | 183 | 1.26 × 10⁻¹⁴ | 1.73 × 10⁻¹³ |
| GCGCNNNNNNNNNNGCGC | 6 | 464 | 2.88 × 10⁻² | 2.84 × 10⁻² |
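
Several patterns in the table use IUPAC nucleotide ambiguity codes (R, Y, W, M, N) rather than plain A/C/G/T. As a reading aid, the sketch below expands such a pattern into a regular expression and counts its occurrences in a set of sequences. It is an illustration only, not the method used in the article: the function names `iupac_to_regex` and `count_occurrences` are hypothetical, and counting overlapping occurrences on a single strand is an assumption, since the table does not state the counting convention.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(pattern):
    """Translate an IUPAC pattern into a regex string.

    A trailing '*' is treated as the table's marker for a known
    transcription factor and is dropped, not matched.
    """
    return "".join(IUPAC[ch] for ch in pattern.rstrip("*"))

def count_occurrences(pattern, sequences):
    """Count occurrences of the pattern over all sequences.

    Assumption: overlapping occurrences are counted, on the given
    strand only. The zero-width lookahead makes the regex engine
    report a match at every start position.
    """
    rx = re.compile("(?=" + iupac_to_regex(pattern) + ")")
    return sum(len(rx.findall(seq)) for seq in sequences)
```

For example, `iupac_to_regex("RTAAAYAA*")` yields `[AG]TAAA[CT]AA`, and `count_occurrences("AA", ["AAAA"])` counts three overlapping occurrences. A non-overlapping convention would instead drop the lookahead and match the expanded pattern directly.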