Table 1 Descriptions of protein datasets. # Seq. gives the number of input protein sequences. Length gives the length of the protein motif searched for. |V| gives the number of vertices in the original graph constructed from the dataset. DEE gives the methods used to prune the graph, denoted by (1) clique-bounds DEE, (2) tighter constrained bounds, and (3) graph decomposition. |V_DEE| gives the number of vertices in the graph after pruning. E-value lists the e-value of the motif found by the LP/DEE algorithm.

From: A combinatorial optimization approach for diverse motif finding applications

Dataset                  # Seq.  Length  |V|    DEE      |V_DEE|  E-value
Lipocalin                5       16      844    (1)      5        3.80 × 10^-16
Helix-Turn-Helix         30      20      6870   (1,2,3)  260      3.88 × 10^-67
Tumor Necrosis Factor    10      17      2329   (1)      10       1.50 × 10^-40
Zinc Metallopeptidase    10      12      7761   (1,2)    10       5.82 × 10^-23
Immunoglobulin Fold      18      10      7498   (1,2,3)  187      3.04 × 10^-24