2.10 Gene family identification
The predicted proteome of the C. undulatus genome and those of 13 other fish genomes, namely ballan wrasse (L. bergylta), corkwing wrasse (S. melops), Nile tilapia (Oreochromis niloticus), clownfish (Amphiprion percula), zebrafish (D. rerio), cave fish (Sinocyclocheilus anshuiensis), tongue sole (Cynoglossus semilaevis), stickleback (G. aculeatus), medaka (Oryzias latipes), mudskipper (Boleophthalmus pectinirostris), spotted seabass (L. maculatus), pufferfish (T. rubripes), and the cartilaginous ghost shark (Callorhinchus milii), were filtered to retain the longest transcript per gene. The filtered proteins were subjected to an all-vs-all BLASTP search (E-value ≤ 1e-5) and then clustered into gene families with OrthoMCL (Li et al. 2003), with the inflation index set at 1.5 to identify orthologs. Genes for which no ortholog could be found in the predicted gene repertoires of the compared genomes were classified as species-specific genes.
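The two bookkeeping steps around the clustering (keeping the longest transcript per gene, and flagging genes without orthologs) can be illustrated with a minimal Python sketch. It is not the pipeline used here: the function names, the tuple-based input layout, and the species labels are hypothetical, and the rule that a gene counts as species-specific when it shares no family with a gene from another species is an assumption about how the OrthoMCL output was interpreted.

```python
from typing import Dict, Iterable, List, Set, Tuple


def longest_isoform_per_gene(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Keep one protein per gene: the longest translated transcript.

    `records` yields (gene_id, transcript_id, protein_sequence) tuples
    (hypothetical layout). Ties keep the first transcript encountered.
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, seq in records:
        current = best.get(gene_id)
        if current is None or len(seq) > len(current[1]):
            best[gene_id] = (transcript_id, seq)
    return best


def species_specific_genes(
    all_genes: Dict[str, Set[str]],
    families: List[Set[Tuple[str, str]]],
) -> Dict[str, Set[str]]:
    """Return, per species, the genes with no ortholog in another species.

    `families` is a list of clusters, each a set of (species, gene_id)
    pairs. Assumption: a gene is species-specific if it is unclustered
    or sits only in clusters containing a single species.
    """
    shared: Set[Tuple[str, str]] = set()
    for family in families:
        species_in_family = {species for species, _ in family}
        if len(species_in_family) > 1:
            shared.update(family)
    return {
        species: {g for g in genes if (species, g) not in shared}
        for species, genes in all_genes.items()
    }
```

For example, calling `longest_isoform_per_gene([("g1", "t1", "MKV"), ("g1", "t2", "MKVLA")])` retains transcript `t2` for gene `g1`. Under the stated assumption, families made up solely of within-species paralogs are also counted as species-specific.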