Table 1 Duplicate sequence percentages in HomFam protein families

From: Instability in progressive multiple sequence alignment algorithms

Protein family    Total seqs  Unique seqs  % Dup
aadh              3119        2348         24.72
aat               25,090      19,879       20.77
Acetyltransf      46,279      31,943       30.98
ace               3983        3787         4.92
adh               21,326      15,452       27.54
aldosered         13,270      10,787       18.71
Ald_Xan_dh_2      2583        2037         21.14
annexin           3133        2288         26.97
asp               3249        2979         8.31
az                1057        892          15.61
biotin_lipoyl     11,826      7332         38.00
blmb              17,194      13,102       23.80
blm               9097        7145         21.46
bowman            494         218          55.87
cah               1374        1197         12.88
ChtBD             769         447          41.87
cryst             1153        909          21.16
cyclo             6282        4967         20.93
cys               4303        3910         9.13
cyt3              379         347          8.44
cytb              3200        2622         18.06
DEATH             1176        874          25.68
DMRL_synthase     2094        1423         32.04
egf               7762        5405         30.36
flav              4606        3103         32.63
GEL               2190        1583         27.72
ghf10             1497        1393         6.95
ghf11             516         461          10.66
ghf13             12,597      9870         21.65
ghf1              4350        3471         20.21
ghf22             748         608          18.72
ghf5              2711        2355         13.13
glob              3942        2828         28.26
gluts             10,085      7841         22.25
gpdh              7683        4993         35.01
hip               162         115          29.01
hla               13,460      9148         32.03
HLH               6776        3417         49.57
HMG_box           4774        2988         37.41
hom               12,029      6044         49.75
hormone_rec       3504        2896         17.35
hpr               3344        1878         43.84
hr                3702        1985         46.38
icd               5673        4505         20.59
il8               1062        799          24.76
ins               787         524          33.42
int               7567        6185         18.26
KAS               2064        1490         27.81
kringle           1082        821          24.12
kunitz            2256        1753         22.30
ldh               7353        3094         57.92
LIM               6423        3729         41.94
ltn               1056        909          13.92
lyase_1           7627        5611         26.43
mmp               1421        1136         20.06
mofe              2561        2326         9.18
msb               4876        4094         16.04
myb_DNA-binding   10,393      7124         31.45
OTCace            4790        3234         32.48
oxidored_q6       3343        1974         40.95
p450              21,001      19,700       6.19
PDZ               14,944      9552         36.08
peroxidase        4509        3589         20.40
phc               2945        1961         33.41
phoslip           928         803          13.47
profilin          682         579          15.10
proteasome        5715        4549         20.40
Rhodanese         14,043      10,011       28.71
rhv               17,970      9151         49.08
ricin             740         548          25.94
rnasemam          492         438          10.98
rrm               27,590      18,692       32.25
rub               1430        975          31.82
rvp               93,675      64,987       30.62
scorptoxin        355         311          12.39
sdr               50,144      40,212       19.81
seatoxin          88          63           28.41
serpin            3136        2957         5.71
slectin           927         749          19.20
sodcu             2031        1586         21.91
sodfe             4447        2728         38.65
Stap_Strp_toxin   634         174          72.56
sti               608         536          11.84
subt              7506        6469         13.81
Sulfotransfer     2484        2269         8.65
tgfb              1598        1022         36.04
tim               3894        2909         25.30
tms               2113        1518         28.16
TNF               551         417          24.32
toxin             488         450          7.79
trfl              830         742          10.60
tRNA-synt_2b      11,288      7670         32.05
uce               4545        3744         17.62
zf-CCHH           88,330      45,901       48.03

  1. The list of HomFam protein families, the total number of sequences in each family, the number of unique sequences, and the percentage of the total number of sequences that are duplicates
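
The "% Dup" column appears to follow directly from the other two columns as (total - unique) / total x 100; for aadh, (3119 - 2348) / 3119 gives 24.72. The sketch below illustrates that arithmetic in Python. It is not the authors' code: the function names are hypothetical, and treating a "duplicate" as an exact repeat of an earlier sequence string is an assumption.

```python
def duplicate_percentage(total: int, unique: int) -> float:
    """Percentage of sequences that are duplicates, rounded to 2 decimals.

    Assumes % Dup = (total - unique) / total * 100, which matches the
    values shown in the table.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= unique <= total:
        raise ValueError("unique must be between 0 and total")
    return round((total - unique) / total * 100, 2)


def summarize_family(sequences: list[str]) -> tuple[int, int, float]:
    """Return (total, unique, % dup) for one family's sequences.

    Assumption: two sequences count as duplicates only when their
    strings are exactly identical.
    """
    total = len(sequences)
    unique = len(set(sequences))
    return total, unique, duplicate_percentage(total, unique)


if __name__ == "__main__":
    # Counts taken from the table rows for aadh and bowman.
    print(duplicate_percentage(3119, 2348))
    print(duplicate_percentage(494, 218))
    # Toy input: four sequences, one exact repeat.
    print(summarize_family(["MKV", "MKV", "MKL", "MAV"]))
```

Counting unique entries with a set keeps the check linear in the number of sequences, which matters for the larger families such as rvp and zf-CCHH.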