DISCUSSION
Genome size and assembly completeness
In this study, a high-quality, highly contiguous pufferfish genome
assembly was reconstructed from data obtained from a single MinION
flow cell and half a lane of Illumina HiSeq. To our knowledge, only one
other highly contiguous reference Tetraodontiformes genome assembly has
previously been constructed using the same strategy, for Thamnaconus
septentrionalis (Bian et al. 2019). The assembly of L. sceleratus
(360 Mb) is comparable in size to that of other
puffers, such as Fugu
rubripes (~365 Mb; Aparicio et al. 2002), T.
flavidus (~377 Mb; Zhou et al. 2019a), T.
bimaculatus (~393.15 Mb; Zhou et al. 2019b), T.
obscurus (~373 Mb; Kang et al. 2020) and T.
nigroviridis (340 Mb, Jaillon et al. 2004). The contig N50 value
(~11 Mb) of the L. sceleratus assembly is
considerably greater than that reported for the genomes of T.
bimaculatus (1.31 Mb; Zhou et al. 2019b) and T. flavidus (4.4
Mb; Zhou et al. 2019a). Similarly, based on BUSCO scores, our assembly
appears to be as complete as other Tetraodontidae genomes (e.g., T.
obscurus [Kang et al. 2020] and T. flavidus [Zhou et al. 2019a]).
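For readers less familiar with the contiguity metric compared above, the contig N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The following is a minimal sketch of that definition only; the function name and the toy contig lengths are hypothetical and are not taken from the L. sceleratus assembly or from the software used in this study.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are sorted from longest to shortest and summed until the
    running total reaches at least half of the total assembly length;
    the length of the contig that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), for illustration only.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = contig_n50(toy_contigs)
print(f"Toy assembly N50: {toy_n50}")
```

Because the metric depends only on the distribution of contig lengths, it can be compared across assemblies of similar total size, as is done for the pufferfish genomes above.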
Repeat content, gene prediction and functional annotation
The percentage of transposable elements (TEs) found in the L.
sceleratus genome (16.55% of the assembled genome) is somewhat higher
than that found in T. septentrionalis (14.2%) (Bian et al., 2019),
T. obscurus (11.05%) (Kang et al., 2020), and M. mola (11%) (Pan et
al. 2016). Moreover, it is more than twofold higher than in T.
rubripes (7.53%) and T. flavidus (6.87%), and almost threefold higher
than in T. nigroviridis (5.60%) (Gao et al., 2014). T. rubripes
contains more copies of transposable elements than T. nigroviridis,
which has been proposed to contribute to its marginally larger genome
size (365-370 Mb) (Jaillon et al., 2004). Although the L. sceleratus
genome is comparable in size to the reported Takifugu genomes, it
harbors a much higher repeat content. Moreover, the genome of D.
holocanthus, of the family Diodontidae, contains 36.35% repetitive
sequences, more than double the repeat content of L. sceleratus.
These findings imply that TEs may have followed independent pathways of
accumulation and diversification across Tetraodontiformes species. In
the case of L. sceleratus, such differential repeat expansion may have
taken place after the divergence of the Takifugu and Tetraodon genera.
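The fold differences discussed in this paragraph follow directly from the reported percentages. As a minimal sketch, the ratios can be recomputed as below; the percentages are those quoted in the text, whereas the variable and function names are hypothetical. Note that the D. holocanthus value refers to total repetitive sequence as reported, so its ratio is only a rough comparison.

```python
# Repeat/TE content (% of assembled genome) as quoted in the text.
L_SCELERATUS_TE = 16.55

reported_content = {
    "T. septentrionalis": 14.2,
    "T. obscurus": 11.05,
    "M. mola": 11.0,
    "T. rubripes": 7.53,
    "T. flavidus": 6.87,
    "T. nigroviridis": 5.60,
    "D. holocanthus": 36.35,  # total repetitive sequence, not TEs only
}


def fold_vs_reference(reference, others):
    """Return reference / value for each species, rounded to 2 decimals."""
    return {species: round(reference / value, 2)
            for species, value in others.items()}


ratios = fold_vs_reference(L_SCELERATUS_TE, reported_content)
for species, ratio in ratios.items():
    print(f"L. sceleratus vs {species}: {ratio}x")
```

A ratio above 1 indicates a higher content in L. sceleratus, and a ratio below 1 (as for D. holocanthus) indicates a lower one.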
Despite such variation in TE content across closely related taxa, a
positive correlation between genome size and TE content has been
documented across a larger evolutionary scale in teleosts (Shao et al.,
2019). For example, the relatively small genome of T. nigroviridis
(~360 Mb) contains 5.6% TEs, in contrast to the zebrafish genome
(~1.4 Gb), which is composed of 55% repetitive sequences (Shao et al.,
2019). This positive correlation is
also reflected in the small size and relatively low repeat content of
the L. sceleratus genome, regardless of differences with other
pufferfish. Nevertheless, these differences merit further exploration,
as they may be informative about genome evolution. For example, LINE
elements are the most abundant in the L. sceleratus genome, with
~170,000 copies, compared with ~12,300 copies in the T. rubripes
genome. This finding indicates dynamic genome evolution in the two
species. Previous studies have shown a correlation between genomic TEs
and species adaptation to new environments, suggesting that TEs may be
associated with invasiveness (Yuan et al. 2018, Stapley et al., 2015).
Thus, the repeat content of L. sceleratus may play a role in its fast
adaptation to novel environments and should be investigated further.
Species tree reconstruction
Although the order Tetraodontiformes is a cosmopolitan taxonomic group
that includes multiple families, large parts of its phylogenetic
relationships remain unexplored. In this study, we present the first
phylogenetic tree based on whole-genome data that includes the invasive
“sprinter” L. sceleratus. L. sceleratus was recovered within
Tetraodontidae, closest to T. nigroviridis, while the long branch of
the Tetraodontidae clade possibly suggests a faster evolutionary rate.
Regarding relationships within the pufferfish group (T. nigroviridis,
T. rubripes, T. flavidus, T. bimaculatus and L. sceleratus), the
resulting topology agrees with previous studies (Hughes et al., 2020,
Hughes et al., 2018, Meynard et al., 2012, Yamanoue et al., 2009).
Moreover, the Tetraodontidae group was recovered confidently as
monophyletic, in accordance with Yamanoue et al. (2011). Our results
suggest that Tetraodontiformes are the closest group to Sparidae and
corroborate the results of Natsidis et al. (2019) and of others
(Kawahara et al., 2008; Meynard et al., 2012), based on both six
mitochondrial and two nuclear genes.
Synteny analysis
All pairwise comparisons in the whole-genome alignment analysis of L.
sceleratus against the four other Tetraodontidae species (Figure 5;
Figures S9-S11) showed highly conserved synteny. The genome that
exhibited the highest synteny conservation with the L. sceleratus
genome was that of T. nigroviridis, in accordance with our
reconstructed phylogeny, which places the two species as more closely
related to each other than to the rest.
The synteny between L. sceleratus and the three species of the genus
Takifugu (T. rubripes, T. bimaculatus and T. flavidus) was less
conserved, especially between L. sceleratus and T. bimaculatus.
To sum up, the higher synteny between L. sceleratus and T.
nigroviridis corroborates their closer phylogenetic position compared
to the three Takifugu species.
Gene family evolution and adaptation
Adapting to a new habitat is a challenging task for a species, requiring
a certain degree of physiological plasticity. To achieve establishment
in a new niche, an invader must face environmental challenges that
involve both biotic and abiotic factors (Crowl et al., 2008). Invasive
species face novel pathogens during the colonisation of new
environments, and the ability to deal with these new immune challenges
is key to their invasive success (Lee and Klasing, 2004).
Interestingly, we found several expanded immune-related families,
including immunoglobulins (C-Type and V-Type), Ig heavy chain
Mem5-like, B-cell receptors and the Fish-specific NACHT associated
domain, which are related to innate immunity (Stein et al., 2007).
In addition, we also detected major histocompatibility complex
(MHC) class I genes in the expanded gene families. MHC genes are
crucial for the immune response, involved in pathogen recognition by T
cells (Germain, 1994), thus initiating the adaptive immune response. The
expanded repertoire of L. sceleratus immune response associated
genes might be related to its survival in novel habitats, through the
detection and inhibition of a wide range of pathogens. We therefore
suggest further research into the role of the expanded genes related to
the immune response.
Another interesting finding was the expansion of the fucosyltransferase
(FUT) gene family. In particular, we detected 24 FUT9 (alpha (1,3)
fucosyltransferase 9) genes. Glycosylation is one of the most frequent
post-translational modifications of a protein. Many proteins involved in
the immune response are glycosylated, extending their diversity and
functionality (Bednarska et al., 2017). Fucosylation, a type of
glycosylation, plays an essential role in cell proliferation, metastasis
and immune escape (Jia et al., 2018). In mice, FucTC has been shown to
regulate leukocyte trafficking between blood and the lymphatic system,
after its engagement in selectin ligand biosynthesis (Maly et al.,
1996).
Overall, based on our results, we hypothesize that the rapidly expanded
innate immune system gene families identified here play a role in the
ability of L. sceleratus to spread rapidly throughout the Eastern
Mediterranean (Kalogirou 2011).