2.7 Read mapping and counting
As a reference for gene expression quantification, we created and annotated a de novo transcriptome assembly of I. pygmaeus CNS and eye tissues using long-read PacBio ISO-sequencing data (see Supplementary Text S2 for the ISO-sequencing, de novo transcriptome assembly, and transcriptome annotation methods). The trimmed and decontaminated RNA-seq reads were mapped against the transcriptome assembly using salmon (v1.3.0) (Patro et al., 2017), with correction for sequence-specific and fragment-level GC biases enabled, the quantification step skipped, and the flags ‘--validateMappings’ and ‘--hardFilter’ set. Corset (v1.09) (Davidson & Oshlack, 2014) was then run on the salmon equivalence class files from all 40 samples to cluster the transcripts to gene level and produce gene-level counts. In Corset, we specified the four groups/treatments (eyes current-day CO2, eyes elevated CO2, CNS current-day CO2 and CNS elevated CO2), switched off the log likelihood ratio test to prevent differentially expressed transcripts from being split into different clusters, and removed links between contigs that were supported by fewer than 10 reads.
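As an illustration only, the settings described above might translate into command lines along the following lines. This Python sketch merely assembles the argument lists; it is not the script used in this study. Several details are assumptions rather than statements from the text: paired-end input, automatic library-type detection (`-l A`), the `--dumpEq` flag for writing equivalence class files, a very large `-D` value as the way to disable the log likelihood ratio test, and `-l` as the Corset option for the minimum number of reads per link. All file names and group labels are hypothetical.

```python
from typing import List, Sequence

# Assumption: a very large -D threshold effectively switches off
# Corset's log likelihood ratio test.
LRT_OFF = "99999999999"


def salmon_cmd(index: str, reads1: str, reads2: str, out_dir: str) -> List[str]:
    """Build a salmon mapping command for one sample (paired-end assumed)."""
    return [
        "salmon", "quant",
        "-i", index,
        "-l", "A",             # assumption: automatic library-type detection
        "-1", reads1,
        "-2", reads2,
        "--seqBias",           # sequence-specific bias correction
        "--gcBias",            # fragment-level GC bias correction
        "--validateMappings",
        "--hardFilter",
        "--skipQuant",         # skip the quantification step
        "--dumpEq",            # assumption: write equivalence class files
        "-o", out_dir,
    ]


def corset_cmd(eq_files: Sequence[str], groups: Sequence[str],
               min_link_reads: int = 10) -> List[str]:
    """Build one Corset command covering all samples.

    groups[i] is the treatment label of eq_files[i].
    """
    if len(eq_files) != len(groups):
        raise ValueError("need exactly one group label per equivalence class file")
    return [
        "corset",
        "-i", "salmon_eq_classes",
        "-g", ",".join(groups),
        "-D", LRT_OFF,
        "-l", str(min_link_reads),  # drop links supported by fewer reads
        *eq_files,
    ]
```

Each list could be passed to `subprocess.run(cmd, check=True)`: one `salmon_cmd` call per sample, followed by a single `corset_cmd` call that receives the equivalence class files of all 40 samples together with their four group labels.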