2.2 DNA extraction and genome size estimation
High-quality DNA was extracted from fresh muscle tissue using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany). The genome size of C. undulatus was estimated from Illumina sequencing data, following a previous study (Xiao et al. 2019). In brief, DNA was randomly sheared into 300–500 bp fragments using a Covaris 2000 instrument, purified, end-repaired, and amplified by PCR. The resulting library was sequenced on the Illumina NovaSeq 6000 platform in 150-bp paired-end (PE) mode (Illumina Inc., San Diego, CA, USA). After removal of low-quality and redundant reads, the clean reads were retained for de novo assembly and genome size estimation. All clean reads were subjected to 17-mer frequency analysis, yielding a k-mer frequency distribution for C. undulatus (Fig. S1). The distribution showed no distinct heterozygous peak at half the expected depth (Fig. S1); therefore, we did not perform heterozygosity analysis in the next step. Genome size was calculated using the amended formula $G = N_{k\text{-mer\_num}} / D_{k\text{-mer\_depth}}$, where $G$ is the genome size, $N_{k\text{-mer\_num}}$ is the number of k-mers, and $D_{k\text{-mer\_depth}}$ is the expected k-mer depth, as described previously (Xiao et al. 2019).
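The calculation G = N / D can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: it assumes the k-mer histogram is available as a mapping from depth to the number of distinct k-mers at that depth, takes the main peak of that histogram as the expected depth, and excludes k-mers below a hypothetical `min_depth` cutoff as likely sequencing errors. The function name, the cutoff, and the input layout are illustrative assumptions, and the "amendment" mentioned above is not specified in the text, so it is not reproduced here.

```python
from typing import Dict


def estimate_genome_size(histogram: Dict[int, int], min_depth: int = 3) -> float:
    """Estimate genome size as G = N / D from a k-mer depth histogram.

    histogram: maps k-mer depth -> number of distinct k-mers at that depth.
    min_depth: hypothetical cutoff; k-mers below it are treated as errors
               and ignored (an assumption of this sketch, not of the source).
    """
    # Keep only depths at or above the error cutoff.
    kept = {d: c for d, c in histogram.items() if d >= min_depth and c > 0}
    if not kept:
        raise ValueError("no k-mers at or above min_depth")

    # D: expected k-mer depth, taken here as the depth of the main peak.
    peak_depth = max(kept, key=lambda d: kept[d])

    # N: total number of k-mers, i.e. depth times count summed over depths.
    total_kmers = sum(d * c for d, c in kept.items())

    return total_kmers / peak_depth


if __name__ == "__main__":
    # Toy histogram (made-up numbers): an error tail at depths 1-2
    # and a single main peak centred on depth 20.
    toy = {1: 1000, 2: 100, 18: 50, 19: 80, 20: 100, 21: 80, 22: 50}
    print(estimate_genome_size(toy))
```

On the toy histogram the main peak lies at depth 20 and 7,200 k-mers remain after the cutoff, so the estimate is 360.0. Setting `min_depth=1` lets the error tail define the peak and inflates the result, which is why low-depth k-mers are usually excluded before applying the formula.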