3.1 ∣ Genomic sequences
We sequenced the genome of a female WWS using a whole-genome shotgun
strategy. Cleaned-up reads provided 107-fold average coverage across
88.23 Gb of assembled sequence (Supplementary Fig. 3 and Supplementary
Tables 1 and 2), with N50 values of 411.09 kb and 1.25 Mb for contigs
and scaffolds, respectively, in the final assembly (generated using
SOAPdenovo4; Supplementary Figs. 4-5 and Supplementary Tables 3-5). We
identified a total of 14,020 protein-coding genes (82.7% of which were
functionally classified; Figure 1a, Supplementary Figs. 6-7 and
Supplementary Tables 6-8). The WWS assembly was further refined using
high-throughput chromosome conformation capture (Hi-C) data, yielding
9 scaffolds with a scaffold N50 of 69.68 Mb (Table 1). As a result, 98.16%
of the contig assembly (638.30 Mb) was distributed across 9 chromosome-level
pseudomolecules (Fig. 2a, b; Table 1 and Supplementary Table 9).
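For readers less familiar with the N50 statistic quoted above for contigs and scaffolds: it is the length L such that sequences of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that definition in Python. The function name `n50` and the toy lengths are illustrative assumptions only; they are not WWS data and not the assembler's own implementation.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Walk from the longest sequence down, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (in bp), not taken from the WWS assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150 is half of the 300 bp total
```

Because the statistic depends only on the sorted length distribution, joining contigs into longer scaffolds or pseudomolecules raises N50 without adding sequence, which is consistent with the larger scaffold values reported after Hi-C refinement.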
When we compared WWS with other insect species, we found 4,615 gene
families in common and 497 gene families unique to WWS (Figure 1a).
Using these protein-coding genes, we generated a timescale for insect
evolution (Figure 1b). Comparative analyses were carried out on gene
clusters among WWS and other representative insect species, with
Daphnia pulex (water flea, a crustacean) as a non-insect outgroup (Figure 1b,
Supplementary Table 10). A total of 24,923 gene families were identified
among these species with 553 clusters as common singlets (Supplementary
Fig. 7). Mobile elements made up a repeat content of about 37.3% of the
WWS genome; long terminal repeats (LTRs) accounted for 28.5%, a much
higher share than other transposons (DNA, LINE and SINE) (Supplementary
Fig. 8 and Supplementary Tables 11-12). In addition, we identified
noncoding RNA (ncRNA) genes, including 357 miRNA, 188 tRNA, 42 rRNA, and
46 snRNA genes (Supplementary Table 13).