Transcript abundance estimation
We received 1.45 billion 100-bp paired-end reads that passed the HiSeq
quality filter, averaging 14.5 million reads per sample. We used
Rcorrector to correct Illumina sequencing errors (Song & Florea, 2015),
then removed adapter sequences and trimmed low-quality bases from the
reads using Trim Galore! (Babraham Bioinformatics, Babraham
Institute). Following developer recommendations, we used a quality score
of 33, a stringency of 5, and a minimum read length of 36 bp. We aligned
corrected, trimmed reads from both datasets to the guppy genome
(http://uswest.ensembl.org/Poecilia_reticulata; downloaded November
2019) and estimated transcript abundance using STAR (Dobin et al., 2013)
with default parameters. On average, 91% (range: 88.8–94.2%) of
reads per individual mapped to the guppy genome assembly.
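
The three steps above can be sketched as command lines for one sample. This is an illustrative sketch, not the exact commands used in the study. The function name, file paths, thread count, and genome index directory are hypothetical. It also assumes that the "quality score of 33" refers to Phred+33 quality encoding (Trim Galore's `--phred33`). If it instead refers to a trimming threshold, `--quality 33` would replace that flag. The commands are only constructed here, not executed.

```python
from typing import Dict, List


def build_commands(
    raw_r1: str,
    raw_r2: str,
    corrected_r1: str,
    corrected_r2: str,
    trimmed_r1: str,
    trimmed_r2: str,
    genome_dir: str,
    out_prefix: str,
    threads: int = 4,
) -> Dict[str, List[str]]:
    """Build (but do not run) the per-sample commands for each step.

    All paths are hypothetical placeholders supplied by the caller; the
    output file names produced by each tool are not inferred here.
    """
    # Step 1: error correction of paired-end reads with Rcorrector.
    rcorrector = [
        "run_rcorrector.pl",
        "-1", raw_r1,
        "-2", raw_r2,
        "-t", str(threads),
    ]

    # Step 2: adapter removal and quality trimming with Trim Galore.
    # ASSUMPTION: "quality score of 33" is read as Phred+33 encoding.
    trim_galore = [
        "trim_galore",
        "--paired",
        "--phred33",
        "--stringency", "5",
        "--length", "36",
        corrected_r1,
        corrected_r2,
    ]

    # Step 3: alignment with STAR. Only the arguments needed to run are
    # given, so every other setting stays at its default.
    star = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", trimmed_r1, trimmed_r2,
        "--outFileNamePrefix", out_prefix,
    ]

    return {"rcorrector": rcorrector, "trim_galore": trim_galore, "star": star}


if __name__ == "__main__":
    commands = build_commands(
        "s1_R1.fq", "s1_R2.fq",
        "s1_R1.cor.fq", "s1_R2.cor.fq",
        "s1_R1.trim.fq", "s1_R2.trim.fq",
        "guppy_star_index", "s1_",
    )
    for step, cmd in commands.items():
        print(step + ":", " ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and a STAR genome index has been generated for the guppy assembly. Keeping the commands as lists avoids shell quoting issues with file names.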