
Table 1 Descriptions of protein datasets. # Seq. gives the number of input protein sequences. Length gives the length of the protein motif searched for. |V| gives the number of vertices in the original graph constructed from the dataset. DEE lists the methods used to prune the graph: (1) clique-bounds DEE, (2) tighter constrained bounds, and (3) graph decomposition. |V_DEE| gives the number of vertices in the graph after pruning. E-value lists the E-value of the motif found by the LP/DEE algorithm.

From: A combinatorial optimization approach for diverse motif finding applications

Dataset                 # Seq.  Length  |V|    DEE      |V_DEE|  E-value
Lipocalin               5       16      844    (1)      5        3.80 × 10^-16
Helix-Turn-Helix        30      20      6870   (1,2,3)  260      3.88 × 10^-67
Tumor Necrosis Factor   10      17      2329   (1)      10       1.50 × 10^-40
Zinc Metallopeptidase   10      12      7761   (1,2)    10       5.82 × 10^-23
Immunoglobulin Fold     18      10      7498   (1,2,3)  187      3.04 × 10^-24