| Method/Database | Time used (s) | Sensitivity (%) | Precision (%) | Specificity (%) | MCC¹ |
|---|---|---|---|---|---|
| PROSITE | - | 85.717 | 93.043 | 99.996 | 0.857 |
| RISOTTO | 18.635 | 47.003 | 99.957 | 100 | 0.470 |
| Pratt | 1598.3 | 81.507 | 94.159 | 99.995 | 0.815 |
| Teiresias | 0.908 | 76.798 | 0.2523 | 41.163 | 0.030 |
| WildSpan (Family-based) | 89.782 | 99.042 | 97.481 | 99.993 | 0.990 |
- The table shows the performance of WildSpan's family-based mining on protein family classification, based on PA10F. The results were compared with PROSITE annotated patterns and three other pattern mining methods: RISOTTO, Teiresias, and Pratt. The input data were prepared by collecting proteins from release 50.9 of UniProtKB/Swiss-Prot (235673 entries), and the discovered patterns were verified against all protein sequences in release 2010/08 of UniProtKB/Swiss-Prot (518415 entries). Fragments and partial matches were excluded from both the training and testing data. All methods were run with their default parameter values.
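- ¹ Matthews correlation coefficient (MCC): (TP×TN - FP×FN) / SQRT( (TP+FP) × (TP+FN) × (TN+FP) × (TN+FN) ), where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives.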
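
As a rough illustration of how the table's metrics relate to the underlying counts, the following Python sketch computes sensitivity, precision, specificity, and MCC from a confusion matrix using the standard definitions. The function name `classification_metrics`, the helper `_safe_div`, and the example counts are hypothetical and are not taken from the study; returning 0 when a denominator is zero is a common convention assumed here, not something the source specifies.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = _safe_div(tp, tp + fn)   # fraction of true family members recovered
    precision = _safe_div(tp, tp + fp)     # fraction of reported matches that are correct
    specificity = _safe_div(tn, tn + fp)   # fraction of non-members correctly rejected
    # Standard MCC denominator: product of the four marginal totals.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, denominator)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the table.
    metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The function returns fractions in the range 0 to 1 (and -1 to 1 for MCC); multiplying sensitivity, precision, and specificity by 100 gives percentages on the same scale as the table.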