2.5 ∣ Identification of contracted and expanded gene families
Gene families were constructed using a hierarchical clustering algorithm applied to ‘all-against-all’ BLASTP alignments. For each gene pair, the high-scoring segment pairs (HSPs) were conjoined by solar. To identify homologous gene pairs, the aligned regions were required to cover more than 30% of both genes. We determined the expansion and contraction of orthologous gene families by comparing cluster sizes between the ancestor and each species using the CAFE (Version 1.6) program. A random birth and death model was used to study changes in gene family size along each lineage of the phylogenetic tree. A probabilistic graphical model (PGM) was introduced to calculate the probability of transitions in gene family size from parent to child nodes in the phylogeny.
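
To make the parent-to-child transition probability concrete, the following minimal Python sketch evaluates it for a single branch under a linear birth and death process. It assumes equal per-gene birth and death rates (called `lam` here) and a branch length `t`. The function name, the parameter names, and the example values are illustrative only and are not taken from the study or from the CAFE source code.

```python
from math import comb


def transition_prob(s: int, c: int, lam: float, t: float) -> float:
    """Probability that a family of size s at the parent node has size c
    at the child node after a branch of length t.

    Assumption: linear birth and death process in which every gene copy
    is gained and lost at the same rate lam (hypothetical parameter name).
    """
    if s < 0 or c < 0:
        raise ValueError("family sizes must be non-negative")
    if lam < 0 or t < 0:
        raise ValueError("rate and branch length must be non-negative")
    # A family with no genes cannot regain any: size 0 is absorbing.
    if s == 0:
        return 1.0 if c == 0 else 0.0

    # alpha is the probability that a single lineage is extinct by time t.
    alpha = lam * t / (1.0 + lam * t)

    total = 0.0
    # Sum over j, the term index of the closed-form solution.
    for j in range(min(s, c) + 1):
        total += (
            comb(s, j)
            * comb(s + c - j - 1, s - 1)
            * alpha ** (s + c - 2 * j)
            * (1.0 - 2.0 * alpha) ** j
        )
    return total


if __name__ == "__main__":
    # Illustrative values only: rate 0.002 per gene per Myr, 100 Myr branch.
    for child_size in range(0, 6):
        p = transition_prob(3, child_size, lam=0.002, t=100.0)
        print(f"P(3 -> {child_size}) = {p:.4f}")
```

For a parent family of size 1, the function returns the single-lineage extinction probability `lam * t / (1 + lam * t)` when the child size is 0. In a full analysis, per-branch probabilities of this kind would be combined over the whole tree to obtain a likelihood for the observed family sizes. That step is omitted from this sketch.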