Genome sequencing, assembly and annotation
One living individual of T. polyphylla was collected from the Chongdugou scenic spot in Henan, China (111°39′41.64″ E, 33°56′23.87″ N) for whole-genome sequencing. We sequenced and assembled the genome using a combination of Illumina short-read and Nanopore long-read sequencing. The completeness of the genome assembly was assessed with both the Core Eukaryotic Genes Mapping Approach (CEGMA; Parra et al., 2007) and Benchmarking Universal Single-Copy Orthologs (BUSCO; Simao et al., 2015). For repetitive element annotation, simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs) were identified in the T. polyphylla genome. We combined de novo, homology-based, and RNA sequencing-aided methods for gene prediction. For details, see Supporting Information Methods S1.
Hi-C library construction and chromosome assembly
To generate a chromosome-level assembly of the T. polyphylla genome, a Hi-C library was constructed following Rao’s protocol (Rao et al., 2014). Fresh leaf cells were fixed in 1% formaldehyde for cross-linking. The cross-linked DNA was homogenized by tissue lysis, digested with DpnII restriction endonuclease, labelled with biotin-14-dCTP, and ligated using T4 DNA Ligase. After reversal of the cross-links, the ligated DNA was purified and sheared into 300–600 bp fragments. Biotinylated DNA fragments were extracted using streptavidin beads to construct the Hi-C fragment library. After PCR enrichment, high-quality libraries were sequenced on an Illumina NovaSeq 6000 platform to produce approximately 160.46 Gb of data.
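The DpnII digestion step can be illustrated in silico. The following is a minimal sketch, not part of the published workflow: it assumes only that DpnII recognizes GATC and cuts immediately 5′ of the G, and the function names and the toy sequence are hypothetical.

```python
import re

# DpnII recognition sequence; the enzyme cuts immediately 5' of the G (^GATC).
DPNII_SITE = "GATC"


def dpnii_site_positions(sequence: str) -> list[int]:
    """Return the 0-based start position of every GATC site in the sequence."""
    return [m.start() for m in re.finditer(DPNII_SITE, sequence.upper())]


def dpnii_fragment_lengths(sequence: str) -> list[int]:
    """Return the lengths of the fragments left by a complete DpnII digest.

    Cut points are taken to be the site start positions; zero-length
    fragments (a site at the very start of the sequence) are dropped.
    """
    cuts = [0] + dpnii_site_positions(sequence) + [len(sequence)]
    lengths = [end - start for start, end in zip(cuts, cuts[1:])]
    return [n for n in lengths if n > 0]


if __name__ == "__main__":
    toy = "AAGATCTTTTGATCCC"  # hypothetical toy sequence, not real data
    print(dpnii_site_positions(toy))
    print(dpnii_fragment_lengths(toy))
```

Counting sites per contig in this way also gives the quantity against which a restriction-site threshold can be checked during Hi-C scaffolding.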
The cleaned Hi-C data were mapped to the initial genome assembly using BOWTIE2 v2.3.2 (Langmead & Salzberg, 2012) in end-to-end mode (--very-sensitive -L 30), and only uniquely mapped read pairs were retained for further analysis. The valid read pairs were then used for chromosome-level genome assembly: the contigs of the draft genome were clustered into chromosomal groups, ordered, and oriented using the LACHESIS pipeline (Burton et al., 2013) with the following parameters: CLUSTER_MIN_RE_SITES = 100, CLUSTER_MAX_LINK_DENSITY = 2.5, CLUSTER_NONINFORMATIVE_RATIO = 1.4, ORDER_MIN_N_RES_IN_TRUNK = 60, and ORDER_MIN_N_RES_IN_SHREDS = 60.
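The settings reported above can be collected in a short script. This is a minimal sketch under stated assumptions rather than the commands actually run: the file names, the thread count, and the choice to align one mate file at a time with -U are hypothetical, and the underscored parameter spellings are assumed to match those used in LACHESIS INI files.

```python
def bowtie2_args(index_prefix: str, fastq: str, sam_out: str, threads: int = 4) -> list[str]:
    """Build a BOWTIE2 argument list with the reported alignment settings.

    Paths and thread count are placeholders (assumptions); the alignment
    options mirror those reported in the text.
    """
    return [
        "bowtie2",
        "--end-to-end",       # end-to-end alignment mode
        "--very-sensitive",   # sensitivity preset reported in the text
        "-L", "30",           # seed length reported in the text
        "-p", str(threads),   # assumed thread count
        "-x", index_prefix,   # index built from the initial assembly
        "-U", fastq,          # one mate file, aligned as unpaired reads (assumption)
        "-S", sam_out,        # SAM output path
    ]


# Clustering and ordering values as reported in the text.
LACHESIS_PARAMS = {
    "CLUSTER_MIN_RE_SITES": 100,
    "CLUSTER_MAX_LINK_DENSITY": 2.5,
    "CLUSTER_NONINFORMATIVE_RATIO": 1.4,
    "ORDER_MIN_N_RES_IN_TRUNK": 60,
    "ORDER_MIN_N_RES_IN_SHREDS": 60,
}


def render_ini_fragment(params: dict) -> str:
    """Render parameters as 'KEY = value' lines for pasting into an INI file."""
    return "\n".join(f"{key} = {value}" for key, value in params.items())


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(bowtie2_args("draft_assembly", "hic_R1.fastq.gz", "hic_R1.sam")))
    print(render_ini_fragment(LACHESIS_PARAMS))
```

The script only prints the command and the INI fragment. Filtering for uniquely mapped read pairs is left out because the text does not state the criterion used.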