Table 1 Duplicate sequence percentages in HomFam protein families

From: Instability in progressive multiple sequence alignment algorithms

Protein family Total sequences Unique sequences Duplicates (%)
aadh 3119 2348 24.72
aat 25,090 19,879 20.77
Acetyltransf 46,279 31,943 30.98
ace 3983 3787 4.92
adh 21,326 15,452 27.54
aldosered 13,270 10,787 18.71
Ald_Xan_dh_2 2583 2037 21.14
annexin 3133 2288 26.97
asp 3249 2979 8.31
az 1057 892 15.61
biotin_lipoyl 11,826 7332 38.00
blmb 17,194 13,102 23.80
blm 9097 7145 21.46
bowman 494 218 55.87
cah 1374 1197 12.88
ChtBD 769 447 41.87
cryst 1153 909 21.16
cyclo 6282 4967 20.93
cys 4303 3910 9.13
cyt3 379 347 8.44
cytb 3200 2622 18.06
DEATH 1176 874 25.68
DMRL_synthase 2094 1423 32.04
egf 7762 5405 30.36
flav 4606 3103 32.63
GEL 2190 1583 27.72
ghf10 1497 1393 6.95
ghf11 516 461 10.66
ghf13 12,597 9870 21.65
ghf1 4350 3471 20.21
ghf22 748 608 18.72
ghf5 2711 2355 13.13
glob 3942 2828 28.26
gluts 10,085 7841 22.25
gpdh 7683 4993 35.01
hip 162 115 29.01
hla 13,460 9148 32.03
HLH 6776 3417 49.57
HMG_box 4774 2988 37.41
hom 12,029 6044 49.75
hormone_rec 3504 2896 17.35
hpr 3344 1878 43.84
hr 3702 1985 46.38
icd 5673 4505 20.59
il8 1062 799 24.76
ins 787 524 33.42
int 7567 6185 18.26
KAS 2064 1490 27.81
kringle 1082 821 24.12
kunitz 2256 1753 22.30
ldh 7353 3094 57.92
LIM 6423 3729 41.94
ltn 1056 909 13.92
lyase_1 7627 5611 26.43
mmp 1421 1136 20.06
mofe 2561 2326 9.18
msb 4876 4094 16.04
myb_DNA-binding 10,393 7124 31.45
OTCace 4790 3234 32.48
oxidored_q6 3343 1974 40.95
p450 21,001 19,700 6.19
PDZ 14,944 9552 36.08
peroxidase 4509 3589 20.40
phc 2945 1961 33.41
phoslip 928 803 13.47
profilin 682 579 15.10
proteasome 5715 4549 20.40
Rhodanese 14,043 10,011 28.71
rhv 17,970 9151 49.08
ricin 740 548 25.94
rnasemam 492 438 10.98
rrm 27,590 18,692 32.25
rub 1430 975 31.82
rvp 93,675 64,987 30.62
scorptoxin 355 311 12.39
sdr 50,144 40,212 19.81
seatoxin 88 63 28.41
serpin 3136 2957 5.71
slectin 927 749 19.20
sodcu 2031 1586 21.91
sodfe 4447 2728 38.65
Stap_Strp_toxin 634 174 72.56
sti 608 536 11.84
subt 7506 6469 13.81
Sulfotransfer 2484 2269 8.65
tgfb 1598 1022 36.04
tim 3894 2909 25.30
tms 2113 1518 28.16
TNF 551 417 24.32
toxin 488 450 7.79
trfl 830 742 10.60
tRNA-synt_2b 11,288 7670 32.05
uce 4545 3744 17.62
zf-CCHH 88,330 45,901 48.03
  1. For each HomFam protein family: the total number of sequences, the number of unique sequences, and the percentage of the total that are duplicates
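
The percentage column can be reproduced from the two count columns. The following is a minimal Python sketch, not the authors' code: it assumes that "duplicate" means an exact, character-for-character repeat of an earlier sequence in the same family, and the function names `duplicate_percentage` and `duplicate_stats` are hypothetical. For some rows the last digit may differ from the table depending on how the value is rounded.

```python
def duplicate_percentage(total, unique):
    """Percentage of sequences that are duplicates, from the two counts."""
    if total == 0:
        return 0.0
    return 100.0 * (total - unique) / total


def duplicate_stats(sequences):
    """Return (total, unique, percent duplicates) for a list of sequences.

    Assumption: two sequences are duplicates only if their strings are
    identical; identifiers and descriptions are ignored.
    """
    total = len(sequences)
    unique = len(set(sequences))  # a set keeps one copy of each string
    return total, unique, duplicate_percentage(total, unique)
```

Calling `duplicate_percentage(3119, 2348)` with the counts from the aadh row gives a value that rounds to 24.72, matching the table.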