1 Introduction
The adaptation of coral reef fish plays an important role in sustaining marine ecosystems. The family Labridae presents a unique opportunity to gain insight into such adaptation. Labrids originated in the late Cretaceous to early Paleogene (Alfaro et al. 2018) and quickly diversified into over 519 species in 71 genera, with extensive inter- and intraspecific variation in color, morphs, body shape, and feeding behavior that adapts them to various reef environments (Liu D et al. 2019). In the labrid feeding apparatus, the paired pharyngeal bones, which are derived from a pair of gill arch bones, are united into a single jawbone, whereas other widespread fishes have separate left and right pharyngeal bones (Cowman et al. 2009). The united pharyngeal bones allow labrid fish to generate a strong bite force for efficient capture of prey (Wainwright et al. 2012). The earliest Labridae fossils already show this pharyngeal apparatus (Bannikov & Sorbini 1990).
Among labrid fishes, the humphead wrasse, Cheilinus undulatus Rüppell, is an endangered species found on coral reefs and in inshore habitats across much of the tropical Indo-Pacific Ocean. It is also one of the most valued and highest-priced fish (Russell 2004). C. undulatus, one of the few predators of sea hares, boxfish, and starfish, keeps the reproduction of such toxic animals in check in coral reef environments and thereby helps maintain the stability of reef ecology (Sadovy 1998). International trade has therefore been restricted to conserve this species (Sadovy et al. 2003). The species was listed as vulnerable in the IUCN 1996 Red Data Book and as threatened in the IUCN 2001 Red List (Donaldson & Sadovy 2001). C. undulatus is characterized by several prominent features, including a large hump on the forehead of adults, large fleshy lips, and a pair of distinctive lines running through the eyes. Body color varies across developmental stages. C. undulatus is the largest member of the family Labridae, reaching a maximum of 2.3 m in length and over 190 kg in weight (Graham et al. 2015).
C. undulatus adults inhabit steep outer reef slopes and benthic zones at depths of 2-60 m, whereas juveniles are typically found in shallower waters adjacent to coral reefs (Sadovy et al. 2003). Little is known about the mechanism underlying this diet-related habitat shift, probably because no whole genome has been available to date, so the underlying genetic architecture, which may involve genes for visual, olfactory, and feeding traits, could not be characterized. In morphological evolution, the specialized pharyngeal jaw apparatus functions chiefly to collect, manipulate, and transport food into the esophagus. Meanwhile, visual sensitivity helps fish detect potential prey through the water column. C. undulatus is therefore likely to have co-evolved a set of visual adaptations for food gathering; however, this remains to be tested. A particularly widespread and well-studied example of such adaptation is variation in the expression of opsin genes. For example, in rainbow trout, the short-wavelength sensitive 1 (SWS1) gene may be nonfunctional in adults but functions in juveniles foraging for zooplankton, making it an important developmental factor (Cheng & Flamarique 2007). Interestingly, diverse expression of opsin genes provides alternative mechanisms for the feeding ecology of labrid fish (Phillips et al. 2016). Opsins in fish are key to the successful colonization of habitats ranging from the dark deep sea to clear mountain streams (Cortesi et al. 2015).
Fish visual opsins form a monophyletic gene family comprising five subfamilies: rhodopsin (Rh1), SWS1, SWS2, middle-wavelength sensitive (Rh2), and long-wavelength sensitive (LWS) opsins, which are sensitive to dim light, ultraviolet, blue, green, and red wavelengths, respectively (Collin et al. 2003). Synteny analysis of opsin genes indicated that a local duplication produced LWS and SWS; subsequently, two rounds of whole-genome duplication expanded the visual opsins into five subfamilies in early vertebrates (Lagman et al. 2014). A five-gene opsin repertoire is found in the jawless lamprey (Geotria australis), suggesting that this repertoire represents the ancestral state for jawed vertebrates (Davies et al. 2007). The majority of ray-finned fishes carry several copies within each opsin subfamily as a result of tandem duplications or whole-genome duplication events (Cortesi et al. 2015; Rennison et al. 2012). LWS and SWS2 duplications in Cyprininae and Rh2 duplication in salmonids have been attributed to tetraploidy (Lin et al. 2017). Tandem duplication is a major contributor to the expansion of the LWS subfamily (Rennison et al. 2012). However, little is known about the molecular mechanism of opsin tandem duplication. Opsin duplicates may diverge or lose function. Color sensitivity may be restored through gene duplication (Sharkey et al. 2017), whereas inactivation of an opsin can result in retinal monochromacy (Springer et al. 2016). Gains and losses of opsin duplicates are believed to correlate with the evolutionary adaptation of fish to different living environments (Lin et al. 2017). Opsin gene repertoires in deep-water fish differ from those of fish living closer to the surface, and LWS genes have been lost in some deep-water species (Rennison et al. 2012). Such events help determine whether fish are successful in catching prey or escaping from predators (Phillips et al. 2016).
Given its visual system and united pharyngeal bones, C. undulatus is an ideal candidate for investigating opsin evolution in coral reef fish. However, a chromosome-level genome assembly of C. undulatus has not been reported. To our knowledge, only the mitochondrial genome (Qi et al. 2013) and a few transcriptomes (Liu H et al. 2019) have been reported for the humphead wrasse. From an evolutionary perspective, genomic resources for C. undulatus would provide insight into the mechanisms by which the visual system supports foraging. In this study, we present the first chromosome-level genome assembly for the endangered humphead wrasse, generated with an assembly strategy that combines Illumina short reads, Nanopore long reads, and Hi-C data. In comparison with other known fish genomes, we found that C. undulatus has five LWS1 genes, four SWS2 genes, and five Rh2 genes, the highest number reported for any fish to date. These multiple copies were initially produced by whole-genome duplication and subsequently expanded by gene conversion, to which transposons contributed. PAML analyses identified positively selected sites in Rh2 genes. RNA sequencing (RNA-seq) analyses showed variation in opsin expression. Our results indicate that the sudden increase in opsin copies may play an important role in the predation strategy, behavioral ecology, sex change, and evolution of this species. We believe that the annotated draft genome assembly will serve as a resource for future studies of the ecology and conservation of the humphead wrasse.