Table 3 MI-based error of subfamily-identifying methods.

From: An automated stochastic approach to the identification of the protein specificity determinants and functional subfamilies

A. Generated data and LacI

| Dataset | SDPsite | bete | Giant component | Protein keys | FASS | S-method | SDPclust |
|---|---|---|---|---|---|---|---|
| Generated family 1 | 0.58 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Generated family 2 | 0.94 | 1.00 | 1.00 | 0.94 | 0.95 | 0.94 | 0.00 |
| LacI | 0.11 | 0.12 | 0.20 | 0.16 | 0.93 | 1.00 | 0.10 |

B. Enzyme dataset

| Pfam family | SDPsite | Giant component | Protein keys | FASS | S-method | SDPclust | # EC | # sequences |
|---|---|---|---|---|---|---|---|---|
| PF00108 | 0.000 | 0.633 | 0.621 | 0.687 | 0.786 | 0.679 | 2 | 22 |
| PF00128 | 0.801 | 0.552 | 0.586 | N/A | N/A | 0.544 | 10 | 154 |
| PF00135 | 0.429 | 0.583 | 0.548 | N/A | N/A | 0.486 | 4 | 129 |
| PF00215 | 0.759 | 0.878 | 0.803 | N/A | 0.758 | 0.751 | 3 | 92 |
| PF00278 | 0.321 | 0.608 | 0.538 | 0.311 | 0.577 | 0.277 | 3 | 55 |
| PF00293 | 0.239 | 0.449 | 0.200 | N/A | N/A | 0.237 | 6 | 205 |
| PF00348 | 1.000 | 0.352 | 0.555 | 0.674 | 0.764 | 0.372 | 3 | 16 |
| PF00351 | 0.292 | 1.000 | 0.495 | N/A | N/A | 0.573 | 3 | 6 |
| PF00579 | 0.492 | 0.749 | 0.603 | 0.629 | 0.764 | 0.472 | 2 | 41 |
| PF00583 | 0.132 | 0.326 | 0.261 | N/A | N/A | 0.311 | 10 | 244 |
| PF00590 | 1.000 | 0.383 | 0.141 | 0.603 | 0.561 | 0.070 | 7 | 22 |
| PF00755 | 0.544 | 0.407 | 0.522 | N/A | N/A | 0.431 | 4 | 22 |
| PF00871 | 1.000 | 0.675 | 0.709 | N/A | N/A | 0.752 | 3 | 12 |
| PF00896 | 0.000 | 0.594 | 0.000 | N/A | N/A | 0.000 | 2 | 13 |
| PF00962 | 0.000 | 0.654 | 0.399 | 0.285 | 0.367 | 0.494 | 2 | 17 |
| PF01048 | 0.000 | 0.722 | 0.571 | 0.912 | 0.912 | 0.912 | 2 | 16 |
| PF01112 | 0.000 | 0.500 | 0.333 | N/A | N/A | 0.333 | 2 | 7 |
| PF01467 | 0.756 | 0.515 | 0.432 | 0.543 | 0.295 | 0.142 | 6 | 67 |
| PF01712 | 0.500 | 0.387 | 0.521 | N/A | N/A | 0.500 | 4 | 14 |
| PF02274 | 0.000 | 0.788 | 0.707 | 0.000 | 1.000 | 0.707 | 2 | 32 |
| PF03171 | 0.668 | 0.096 | 0.196 | 0.812 | 0.813 | 0.075 | 11 | 153 |
| Overall distance* | 0.184 | 0.276 | 0.204 | 0.250 | 0.172 | 0.165 | | |

   
  1. The following web servers were used for testing: SDPsite http://bioinf.fbb.msu.ru/SDPsite/index.jsp, bete http://phylogenomics.berkeley.edu/cgi-bin/SCI-PHY/input_SCI-PHY.py, giant component (unavailable, implemented in house), Protein Keys http://www.proteinkeys.org/proteinkeys/, FASS, S-method http://treedet.bioinfo.cnio.es/, SDPclust http://bioinf.fbb.msu.ru/SDPfoxWeb/main.jsp. The best performing method is marked in bold.
  2. * All groups are treated together, as members of the same family.
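
The table reports errors between 0 (predicted subfamilies match the reference grouping) and 1 (no agreement), but the formula is not given here. As an illustration only, the sketch below computes one common mutual-information-based error between two partitions, 1 - I(P;R)/H(P,R). This normalization and the name `mi_error` are assumptions made for the example, not necessarily the measure used to produce the values above.

```python
from collections import Counter
from math import log


def mi_error(predicted, reference):
    """Return 1 - I(P;R) / H(P,R) for two labelings of the same sequences.

    Assumed normalization (hypothetical, for illustration): 0.0 means the
    two partitions are identical up to relabeling, 1.0 means they share
    no information.
    """
    if len(predicted) != len(reference):
        raise ValueError("both labelings must cover the same sequences")
    n = len(predicted)
    if n == 0:
        raise ValueError("labelings must not be empty")

    joint = Counter(zip(predicted, reference))  # co-occurrence counts
    p_counts = Counter(predicted)               # predicted group sizes
    r_counts = Counter(reference)               # reference group sizes

    # Joint entropy H(P,R) of the two labelings.
    h_joint = -sum(c / n * log(c / n) for c in joint.values())
    if h_joint == 0.0:
        # Both labelings put every sequence into one group: identical.
        return 0.0

    # Mutual information I(P;R) computed from the same counts.
    mi = sum(
        c / n * log(c * n / (p_counts[p] * r_counts[r]))
        for (p, r), c in joint.items()
    )
    return 1.0 - mi / h_joint


if __name__ == "__main__":
    reference = ["EC1", "EC1", "EC2", "EC2"]
    print(mi_error(["a", "a", "b", "b"], reference))  # same split, relabeled
    print(mi_error(["a", "a", "a", "a"], reference))  # everything in one group
```

Under this assumed definition, a method that places all sequences in a single group scores an error of 1, and a method that reproduces the reference grouping under different labels scores 0.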