Table 4 Motifs identified using phylogenetic information. Motifs and details of their host sequences for phylogenetic motif finding. All datasets tested are from [9]. DNA region gives the DNA region considered (PR signifies promoter region). # Seq. gives the number of input sequences. Motif (id) gives the consensus sequence of the discovered motif and, where applicable, its correspondence with the motifs of [9]. All listed motifs have been documented as regulatory elements in TRANSFAC [45]. For datasets other than the insulin dataset, only the best motif is reported; for the insulin dataset, multiple motifs are reported in order of discovery.

From: A combinatorial optimization approach for diverse motif finding applications

| DNA region | # Seq. | Motif (id) |
| --- | --- | --- |
| Growth-horm. 5' UTR + PR (380 bp) | 16 | TATAAAAA (7) |
| Histone H1 5' UTR + PR (650 bp) | 4 | AAACAAAAGT (2) |
| C-fos 5' UTR + PR (800 bp) | 6 | CCATATTAGG |
| C-fos first intron (376 to 758 bp) | 7 | AGGGATATTT (3) |
| Interleukin-3 5' UTR + PR (490 bp) | 6 | TGGAGGTTCC (3) |
| C-myc second intron (971 to 1376 bp) | 6 | TTTGCAGCTA (5) |
| C-myc 5' PR (1000 bp) | 7 | GCCCCTCCCG |
| Insulin family 5' PR (500 bp) | 8 | GCCATCTGCC (2) |
| | | TAAGACTCTA (1) |
| | | CTATAAAGCC (3) |
| | | CAGGGAAATG (4) |