3.1 RNA and raw sequencing data quality statistics
Out of the 120 samples for which RNA was extracted, 86 had a RIN value (a measure of RNA integrity) equal to or above 8.8. Little variation in RIN scores was observed among the sampled tissues and sampling methods (Supporting Information Table S1), except for liver, which overall showed higher levels of RNA degradation and was therefore not used for library construction and sequencing. Mean and standard deviation of RIN values for the four tissues were: 9.6±0.22 (blood), 9.2±0.40 (muscle), 8.0±1.21 (liver), and 9.0±1 (gill). Mean and standard deviation of RIN values for the three treatment groups, excluding liver, were: 9.2±0.43 (dip netting), 9.3±0.34 (electrofishing), and 9.2±1.06 (tissue harvesting after 5 minutes). After excluding liver from the analyses, we found no significant differences in RIN values among groups (F = 0.299, df = 2, p = 0.74) or among tissues within each group (F = 0.595, df = 4, p = 0.67).
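The among-group comparison above is an F test. As an illustration only, a one-way ANOVA F statistic can be computed as sketched below in Python. This is a minimal sketch and not necessarily the model the authors fitted (the tissue-within-group test in particular is not reproduced here). The function name `one_way_anova_f` and the RIN values in the usage lines are hypothetical placeholders, not data from this study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder RIN values (NOT study data), one list per sampling method
dip_netting = [9.1, 9.4, 9.0, 9.3]
electrofishing = [9.2, 9.5, 9.3, 9.1]
after_5_min = [9.6, 8.7, 9.4, 9.0]

f_stat, df_b, df_w = one_way_anova_f([dip_netting, electrofishing, after_5_min])
print(f"F = {f_stat:.3f}, df = {df_b}, {df_w}")
```

With three treatment groups, the between-group degrees of freedom are 3 - 1 = 2, which matches the df reported for the among-group test.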
RNA sequencing from 3’ Tag-Seq samples, regardless of tissue type, yielded a total of 367.2 million reads for individuals captured by net (mean = 13.1 million ± 0.72; N = 28), 328.2 million reads for samples collected immediately after electrofishing (mean = 12.62 million ± 0.46; N = 26), and 347.6 million reads for samples electrofished and processed after 5 minutes (mean = 12.87 million ± 0.71; N = 27) (Supporting Information Table S1). The final number of reads per individual ranged from 11 million to 15.6 million (mean = 12.88 million ± 0.67). Of the 11 million reads randomly selected for each sample, on average about 77% mapped uniquely to the rainbow trout (O. mykiss) genome, independently of the sampling method used (range: 67.7–86.3%, Supporting Information Table S1), indicating that the libraries were of good quality (Dobin and Gingeras, 2015) for downstream analyses.
RNA sequencing from the 14 whole mRNA-Seq (NEB) samples (blood only) yielded a total of 564 million reads for individuals captured by net (mean = 112.9 million ± 13.95; N = 5), 563.4 million reads for samples collected by electrofishing and sampled immediately (mean = 112.7 million ± 22.4; N = 5), and 350.4 million reads for electrofishing samples processed after 5 minutes (mean = 87.6 million ± 7.4; N = 4). The final number of reads per individual ranged from 77.8 to 148.8 million (mean = 105.6 million ± 19.1). The number of reads per sample was therefore on average roughly 8 times higher for NEB than for 3’ Tag-Seq (105.6 vs. 12.88 million).
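The per-sample means and the fold difference between library types can be recomputed from the totals and sample sizes quoted above. The Python sketch below does this as a quick consistency check. It uses only figures stated in this section, and the variable and function names are illustrative.

```python
# (total reads in millions, number of samples) per group, as quoted in the text
TAG_SEQ = {
    "net": (367.2, 28),
    "electrofishing_immediate": (328.2, 26),
    "electrofishing_5min": (347.6, 27),
}
NEB = {
    "net": (564.0, 5),
    "electrofishing_immediate": (563.4, 5),
    "electrofishing_5min": (350.4, 4),
}


def group_means(groups):
    """Mean reads per sample (millions) within each group."""
    return {name: total / n for name, (total, n) in groups.items()}


def pooled_mean(groups):
    """Mean reads per sample (millions) across all groups combined."""
    total_reads = sum(total for total, _ in groups.values())
    total_samples = sum(n for _, n in groups.values())
    return total_reads / total_samples


pooled_tag = pooled_mean(TAG_SEQ)
pooled_neb = pooled_mean(NEB)
fold = pooled_neb / pooled_tag  # NEB depth relative to 3' Tag-Seq

print(f"3' Tag-Seq: {pooled_tag:.2f} M reads/sample")
print(f"NEB:        {pooled_neb:.1f} M reads/sample")
print(f"Fold difference: {fold:.1f}x")
```

The pooled means reproduce the reported 12.88 and 105.6 million reads per sample, and their ratio is about 8.2.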
After mapping the randomly selected 11 million (3’ Tag-Seq) or 40 million (NEB) reads (see Materials and Methods) to the O. mykiss reference genome, each 3’ Tag-Seq and NEB sample had >8 and >28 million reads, respectively, available for the analyses of gene expression (Supporting Information Table S1). The percentage of reads that mapped uniquely to the O. mykiss genome was similar among all the groups compared in this study (see % mapping per group above and in Supporting Information Table S1), suggesting that, with >10 million reads, the two RNA-Seq library constructions (3’ Tag-Seq and NEB) yield roughly the same percentage of uniquely mapped reads, even though we used 40 million reads for the whole mRNA-Seq data instead of the 11 million reads used for 3’ Tag-Seq.
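Random selection of a fixed number of reads per sample (11 or 40 million here) can be done in a single pass over a FASTQ file with reservoir sampling. The Python sketch below shows one common way to do this. It is an assumption for illustration and not necessarily the tool or procedure described in Materials and Methods. The function names `read_fastq` and `reservoir_sample` and the file name in the usage comment are hypothetical.

```python
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples from an open text handle."""
    while True:
        record = tuple(islice(handle, 4))
        if len(record) < 4:  # end of file (or truncated final record)
            return
        yield record


def reservoir_sample(records, k, seed=0):
    """Return k records chosen uniformly at random in one pass.

    If fewer than k records are available, all of them are returned.
    """
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    reservoir = []
    for i, record in enumerate(records):
        if i < k:
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (i + 1)
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = record
    return reservoir


# Usage sketch (hypothetical file name):
# with open("sample.fastq") as handle:
#     subset = reservoir_sample(read_fastq(handle), k=11_000_000)
```

Holding 11 million records in memory may be impractical on small machines, in which case a two-pass approach that stores only the selected record indices is a common alternative.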
Raw reads (i.e., before selecting 11M reads for 3’ Tag-Seq and 40M reads for whole mRNA-Seq) mapped to the rainbow trout (O. mykiss) genome recovered a different number of genes between the two RNA library types, independently of the number of reads mapped per gene. Specifically, whole mRNA-Seq recovered two to three times more genes than 3’ Tag-Seq (Supporting Information Table S1). Differential expression analysis (see below) of the 14 blood samples for which RNA libraries were built using both 3’ Tag-Seq and mRNA-Seq indicates that the presence/absence of genes between the two techniques is independent of gene transcript length (Supporting Information Figure S1).