Table 1 Original dataset and datasets after threshold removal

From: HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing

| Dataset | Number of sequences | Minimum length | Maximum length | Average length | File size |
| --- | --- | --- | --- | --- | --- |
| Φ DNA (1×) | 672 | 16,556 | 16,579 | 16,569.7 | 10 MB |
| Φ DNA (100×) | 67,200 | As above | As above | As above | 1.1 GB |
| Φ DNA (1000×) | 672,000 | As above | As above | As above | 11 GB |
| Φ RNA (small) | 108,453 | 807 | 1,599 | 1,442.8 | 156 MB |
| Φ RNA (large) | 1,011,621 | 807 | 1,629 | 1,388.5 | 1.4 GB |
| Φ Protein (1×) | 17,892 (218 families) | 19 | 4,895 | 459.0 | 15 MB |
| Φ Protein (100×) | 1,789,200 (218 families) | As above | As above | As above | 1.5 GB |
| Φ Protein (1000×) | 17,892,000 (218 families) | As above | As above | As above | 15 GB |
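
The per-dataset columns in Table 1 (number of sequences, minimum, maximum and average length) are simple summary statistics over sequence lengths. As a minimal sketch, assuming the datasets are stored as FASTA text (the source does not state the file format here), they could be computed as follows. The function names `fasta_lengths` and `summarize` are hypothetical and are not part of HAlign-II.

```python
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every record in FASTA-formatted lines.

    Assumption: records start with a '>' header line and the sequence
    may be wrapped over several lines.
    """
    lengths: List[int] = []
    current = None  # None until the first header line is seen
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)  # close the previous record
            current = 0
        elif current is not None:
            current += len(line)  # accumulate wrapped sequence lines
    if current is not None:
        lengths.append(current)  # close the last record
    return lengths


def summarize(lengths: List[int]) -> Dict[str, float]:
    """Compute the four length statistics reported per dataset."""
    if not lengths:
        raise ValueError("no sequences found")
    return {
        "number": len(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "avg_length": round(sum(lengths) / len(lengths), 1),
    }
```

To summarize a file, pass an open file handle, for example `summarize(fasta_lengths(open(path)))`, where `path` is a placeholder for a local FASTA file. Under this reading, the 100× and 1000× rows hold copies of the 1× sequences, so only the sequence count and file size change, which is consistent with the "As above" entries for the length columns.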