Sample filtering, mapping and alignment
We removed adapter sequences using Scythe (Buffalo 2018),
trimmed all data using cutadapt to remove barcodes (Martin
2011) and removed low quality sequences using Sickle (parameters: -t
sanger -q 20 -l 50) (Joshi and Fass 2011). We masked the D. innubila reference genome using D. innubila TE
sequences generated previously and RepeatMasker (parameters: -s -gccalc
-gff -lib customLibrary) (Smit and Hubley 2013-2015;
Hill et al. 2019). We then mapped the short reads to the
masked D. innubila genome using BWA MEM (Li and Durbin
2009), and sorted and indexed the alignments using SAMTools (Li et al. 2009). Following mapping, we added read groups, marked and removed
sequencing and optical duplicates, and realigned around indels in each
mapped BAM file using Picard and GATK
(http://broadinstitute.github.io/picard; McKenna et al. 2010; DePristo et al. 2011). We then
removed individuals with low coverage of the D. innubila genome
(less than 5x coverage for 80% of the non-repetitive genome), and
individuals we suspected of being misidentified as D. innubila following collection due to anomalous mapping. This left us with 280 D. innubila wild flies (48-84 flies per population) from 2017
and 38 wild flies from 2001 with at least 5x coverage across at least
80% of the euchromatic genome (Supplementary Table 1).
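
The coverage criterion above (at least 5x depth across at least 80% of the unmasked genome) can be illustrated with a minimal Python sketch. This is not the script used in the study, which the text does not describe. It assumes per-site depths are available as three tab-separated columns (chromosome, position, depth), as produced by a tool such as samtools depth, and that masked sites have already been excluded. The function names are hypothetical.

```python
from typing import Iterable, Iterator

MIN_DEPTH = 5        # minimum per-site depth (5x), from the text
MIN_FRACTION = 0.80  # minimum fraction of sites at that depth, from the text


def parse_depth_lines(lines: Iterable[str]) -> Iterator[int]:
    """Yield the depth column from tab-separated chrom/pos/depth lines.

    Assumption: input is restricted to non-repetitive (unmasked) sites.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _chrom, _pos, depth = line.split("\t")[:3]
        yield int(depth)


def fraction_covered(depths: Iterable[int], min_depth: int = MIN_DEPTH) -> float:
    """Return the fraction of sites with depth >= min_depth (0.0 if no sites)."""
    total = 0
    covered = 0
    for depth in depths:
        total += 1
        if depth >= min_depth:
            covered += 1
    return covered / total if total else 0.0


def passes_coverage_filter(
    depths: Iterable[int],
    min_depth: int = MIN_DEPTH,
    min_fraction: float = MIN_FRACTION,
) -> bool:
    """True if the individual meets the coverage threshold and is retained."""
    return fraction_covered(depths, min_depth) >= min_fraction
```

An individual would be retained when passes_coverage_filter(parse_depth_lines(handle)) returns True for its depth file; the streaming design avoids holding a whole-genome depth table in memory.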