2.2 DNA extraction and genome size estimation
High-quality DNA was extracted from
fresh muscle tissues using DNeasy Blood & Tissue Kits (Qiagen, Hilden,
Germany). The genome size of C. undulatus was estimated based on
Illumina DNA sequencing technology, as performed in a previous study
(Xiao et al. 2019). In brief, DNA was randomly sheared to
300–500 bp fragments using Covaris 2000, purified, end-repaired, and
amplified using PCR. The constructed DNA library was sequenced using the
Illumina NovaSeq 6000 platform in 150-bp paired-end (PE) mode (Illumina Inc., San Diego,
CA, USA). After removal of low-quality and redundant reads, the remaining clean
reads were used for de novo assembly and for estimating the genome
size. All clean reads were subjected to 17-mer frequency distribution
analysis. We obtained a k-mer frequency distribution for C.
undulatus (Fig. S1). The distribution showed no distinct heterozygous
peak at half of the expected depth (Fig. S1). Therefore, we did
not perform heterozygosity analysis in the next step. Genome size was
calculated using the following modified formula:
$G = N_{k\text{-mer\_num}} / D_{k\text{-mer\_depth}}$,
where $G$ is the genome size, $N_{k\text{-mer\_num}}$ is the number of k-mers, and $D_{k\text{-mer\_depth}}$ is the expected k-mer depth, as described previously
(Xiao et al. 2019).
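
As an illustration of the k-mer based calculation, the following minimal Python sketch counts 17-mers in a set of reads, builds the depth histogram, and divides the total number of k-mers by the depth of the main peak. It is not the pipeline used in this study. The function names, the `min_depth` cutoff for discarding likely erroneous low-depth k-mers, and the use of the histogram peak as the expected depth are assumptions made for the example. Reverse-complement (canonical) k-mers are not merged here, which dedicated k-mer counters typically do.

```python
from collections import Counter


def count_kmers(reads, k=17):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def depth_histogram(kmer_counts):
    """Map each depth to the number of distinct k-mers observed at that depth."""
    return Counter(kmer_counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Return (G, D), where G = N / D.

    N is the total number of k-mers with depth >= min_depth (assumed
    error cutoff), and D is the depth at the main histogram peak
    (assumed to represent the expected k-mer depth).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


if __name__ == "__main__":
    # Hypothetical toy histogram: depth -> number of distinct k-mers.
    toy_histogram = {1: 1000, 2: 10, 19: 50, 20: 100, 21: 50}
    size, depth = estimate_genome_size(toy_histogram)
    print(f"expected depth D = {depth}, estimated genome size G = {size:.0f}")
```

In the toy histogram, the 1000 k-mers seen only once are treated as sequencing errors and excluded, so the estimate depends on where that cutoff is placed. For real data the histogram would normally come from a dedicated k-mer counting tool rather than an in-memory counter.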