Transcriptome Assembly and Transcriptome Assessment
To generate a comprehensive transcriptome for N. riversi, we combined all RNAseq data and generated a de novo transcriptome assembly using Trinity v2.10.0 (Haas et al. 2013). We used default settings, but normalized the input reads in silico based on the calculated maximum read coverage. To provide a quantitative assessment of transcriptome completeness, we first assessed the number of full-length transcripts using blastx v2.7.1 (Camacho et al. 2009) to query the UniProt Swiss-Prot database (UniProt Consortium 2019). We then examined alignment scores relative to a set of near-universal single-copy orthologs using the software BUSCO v2.0 (Seppey et al. 2019b). We selected the reference gene set for Endopterygota (OrthoDB v9), which contains 2,442 genes. To further refine the transcriptome assembly, we generated a super-transcriptome by merging the de novo transcriptome from Trinity with the annotated genome assembly (see below) using Necklace v1.11 (Davidson & Oshlack 2018). The goal of this step was to produce a compact but comprehensive set of transcribed genes that reflects total evidence. The super-transcriptome assembly has been made publicly available on NCBI (TSA Accession: GIWW00000000).
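As an illustration of the full-length transcript assessment, the following minimal Python sketch counts reference proteins whose length is largely covered by a single blastx alignment. It is not the exact procedure used here: the input tuple layout (transcript, protein, subject start, subject end, subject length), the function names, and the 80% coverage threshold are assumptions made for illustration only.

```python
from collections import defaultdict


def best_coverage_per_protein(hits):
    """Return the best single-hit percent coverage for each reference protein.

    hits: iterable of (transcript_id, protein_id, sstart, send, slen), where
    sstart/send are alignment coordinates on the protein and slen is the
    protein length (hypothetical layout, e.g. parsed from tabular BLAST output).
    """
    best = defaultdict(float)
    for _transcript, protein, sstart, send, slen in hits:
        aligned = abs(send - sstart) + 1          # aligned residues on the protein
        coverage = 100.0 * aligned / slen         # percent of protein length covered
        best[protein] = max(best[protein], coverage)
    return dict(best)


def count_full_length(hits, min_cov=80.0):
    """Count proteins covered over at least min_cov percent of their length.

    The 80% default is an illustrative threshold, not a value from the text.
    """
    coverage = best_coverage_per_protein(hits)
    return sum(1 for cov in coverage.values() if cov >= min_cov)


# Toy data for demonstration only (not real results).
example_hits = [
    ("tx1", "P1", 1, 100, 100),
    ("tx2", "P1", 10, 50, 100),
    ("tx3", "P2", 1, 50, 100),
    ("tx4", "P3", 21, 100, 100),
]
```

Calling count_full_length(example_hits) on the toy data counts the proteins that at least one transcript covers over 80% or more of their length. Taking the best hit per protein, rather than per transcript, keeps redundant isoforms from inflating the count.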