Table 1 Detailed description of the ten gene families of the mammalian dataset

From: Aligning coding sequences with frameshift extension penalties

| Gene family | Human gene | # of genes | # of CDS (N) | Total CDS length | # of CDS pairs \(\frac{N(N-1)}{2}\) |
|---|---|---|---|---|---|
| I (FAM86) | ENSG00000118894 | 6 | 14 | 10335 | 91 |
| II (HBG017385) | ENSG00000143867 | 6 | 10 | 8988 | 45 |
| III (HBG020791) | ENSG00000179526 | 6 | 10 | 11070 | 45 |
| IV (HBG004532) | ENSG00000173020 | 17 | 33 | 52356 | 528 |
| V (HBG016641) | ENSG00000147041 | 13 | 33 | 64950 | 528 |
| VI (HBG014779) | ENSG00000233803 | 28 | 44 | 45813 | 946 |
| VII (HBG012748) | ENSG00000134545 | 24 | 44 | 28050 | 946 |
| VIII (HBG015928) | ENSG00000178287 | 5 | 19 | 5496 | 171 |
| IX (HBG004374) | ENSG00000140519 | 13 | 30 | 36405 | 435 |
| X (HBG000122) | ENSG00000105717 | 11 | 24 | 27081 | 276 |
| Total number of pairs of CDS | | | | | 4011 |
For each gene family, the table gives the family identifier used in [6] or [12], the Ensembl identifier of a human gene that is a member of the family, the number of human, mouse and cow genes in the family, the total number of CDS of these genes, the summed length of these CDS, and the number of distinct pairs of CDS.
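The pair counts in the last column follow directly from the number of CDS in each family: N sequences give \(\frac{N(N-1)}{2}\) distinct unordered pairs. The short sketch below recomputes that column and the overall total from the CDS counts in the table. The variable names (`cds_per_family`, `pairs_per_family`, `total_pairs`) are illustrative and are not taken from the paper.

```python
from math import comb

# Number of CDS per gene family, copied from the "# of CDS" column of Table 1.
cds_per_family = {
    "I": 14, "II": 10, "III": 10, "IV": 33, "V": 33,
    "VI": 44, "VII": 44, "VIII": 19, "IX": 30, "X": 24,
}

# Distinct unordered pairs among N sequences: N * (N - 1) / 2, i.e. C(N, 2).
pairs_per_family = {fam: comb(n, 2) for fam, n in cds_per_family.items()}
total_pairs = sum(pairs_per_family.values())

for fam, n_pairs in pairs_per_family.items():
    print(f"{fam}: {n_pairs}")
print("Total:", total_pairs)
```

Under these inputs the per-family values match the table (for example, 91 pairs for family I), and the sum over the ten families is 4011, the total reported in the last row.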