Evidence for local adaptation in each population
Though FST is low across most of the genome in each population, there are several genomic regions with elevated FST. In addition to the entirety of Muller element B, there are narrow chimneys of high FST on Muller elements D and E (Figure 4, Supplementary Figure 2). We attempted to identify whether any gene ontology groups have significantly higher FST than the rest of the genome. We find that the genes in the upper 2.5th percentile for FSTare enriched for antifungal genes in all populations, these genes are distributed across the genome and so not all under one peak of elevated FST (Supplementary Table 3, GO enrichment = 16.414,p -value = 1.61e-10). Interestingly, this is the only immune category with elevated FST (Figure 2F), with most of the immune system showing no divergence between populations (Figure 2F, Supplementary Figure 4). This might suggest that most pathogen pressures are consistent among populations except for fungal pathogen pressure which may be more variable.
Another gene ontology category with significantly higher FST is cuticle genes (Figure 2E, Supplementary Table 3, GO enrichment = 5.03, p -value = 8.68e-08), which could be associated with differences in the environment between locations (toxin exposure, humidity, etc .). Consistent with this result, the peak of FST on Muller element D (Figure 5, Muller element D, 11.56-11.58Mb) is composed of exclusively cuticle development proteins (e.g. Cpr65Au, Cpr65Av, Lcp65Ad) with elevated FST in these genes in both the SR and HU populations as well as PR (Figure 5), suggesting that they may be adapting to differing local conditions in those populations.
Two other clear peaks on Muller element E are also composed related genes. Interestingly, there appear to be three regions of the D. innubila genome with translocated mitochondrial genes (Figure 5). The first peak (Muller element E, 11.35-11.4Mb) is composed exclusively of one of these translocated mitochondrial regions with 3 mitochondrial genes (including cytochrome oxidase II). The second peak (Muller element E, 23.60-23.62Mb) contains four other mitochondrial genes (including cytochrome oxidase III and ND5) as well as genes associated with nervous system activity (such as Obp93a and Obp99c ). We find no correlation between coverage of these regions and mitochondrial copy number (Supplementary Table 1, Pearsons’ correlation t-value = 0.065,p -value = 0.861), so this elevated FST is probably not an artefact of mis-mapping reads. However, we do find these regions have elevated copy number compared to the rest of the genome (Supplementary Figure 5, GLM t-value = 9.245, p -value = 3.081e-20), and so this elevated divergence may be due to collapsed paralogs. These insertions of mtDNA are also found in D. falleniand are diverged from the mitochondrial genome, suggesting ancient transpositions. The nuclear insertions of mitochondrial genes are also enriched in the 97.5th percentile for FST in HU and PR, when looking at only autosomal genes (Supplementary Table 3, GO enrichment = 4.53, p -value = 3.67e-04). Additionally, several other energy metabolism categories are in the upper 97.5th percentile in CH. Overall these results suggests a potential divergence in the metabolic needs of each population, and that several mitochondrial genes may have found a new function in the D. innubila genome and may be diverging due to differences in local conditions. Alternatively, given the male-killingWolbachia parasitizing D. innubila (Dyer 2004), it is possible the mitochondrial translocations contain functional copies of mitochondrial genes that can efficiently respond to selection unlike their still mtDNA-linked paralogs.
There has been considerable discussion over the last several years about the influence of demographic processes and background selection on inference of local adaptation (Cutter and Payseur 2013; Cruickshank and Hahn 2014; Hoban et al. 2016; Matthey-Doret and Whitlock 2018). In contrast to FST which is a relative measure of population differentiation, DXY is an absolute measure that may be less sensitive to other population-level processes (Nei 1987; Cruickshank and Hahn 2014). In our data, windows with peaks of elevated FST also have peaks of DXY in all pairwise comparisons (Figure 4, Supplementary Figure 6), and FST and DXY are significantly correlated (GLM R2 = 0.823, t-value = 11.371, p -value = 6.33e-30), consistent with local adaptation. The upper 97.5th percentile for DXY is enriched for chorion proteins in all pairwise comparisons and antifungal proteins for all comparisons involving PR (Supplementary Table 5). We also find peaks of elevated pairwise diversity exclusively on the mitochondrial translocations (Supplementary Figure 6), suggesting unaccounted for variation in these genes which is consistent with duplications detected in these genes (Rastogi and Liberles 2005). This supports the possibility that unaccounted for duplications may be causing the elevated FST, DXY and pairwise diversity (Supplementary Figure 5 & 6). We find no evidence for duplications in the antifungal, cuticle or chorion proteins, suggesting the elevated FST and DXY is likely due to local adaptation (Figure 4, Supplementary Figure 5).
Recent adaptation often leaves a signature of a selective sweep with reduced polymorphism near the site of the selected variant. We attempted to identify selective sweeps in each population using Sweepfinder2 (Huber et al. 2016). There was no evidence of selective sweeps overlapping with genes with elevated FST(Supplementary Figure 7A, χ2 test for overlap of 97.5th percentile windows χ2 = 1.33p -value = 0 .249) but there was one extreme peak in PR on Muller D (Supplementary Figure 7B, Muller D, 18.75-19Mb). This peak was also found in all other populations though not on the same scale. The center of this peak is just upstream of the cuticle protein Cpr66D , in keeping with the suggestion of local adaptation of the cuticle in all populations, with the strongest signal in PR. This sweep is also upstream of four chorion proteins (Cp15, Cp16, Cp18, Cp19 ) and covers (within 10kbp of the sweep center) several cell organization proteins (Zasp66, Pex7, hairy, Prm, Fhos ). These chorion proteins also have significantly elevated DXY compared to other genes within 50kbp (Wilcoxon Rank Sum W = 45637000, p -value = 0.0158) and are under a chimney of elevated DXY in all comparisons (Supplementary Figure 6 & 7), consistent with recent selection of population specific variants. We also find evidence of several selective sweeps in the telomere of the X chromosome (Muller A, 39.5-40.5Mb), among several uncharacterized genes. Given the suppression of recombination in the heterochromatic portions of chromosomes, we would expect evidence for several selective sweeps for even weakly positively selected variants, as is also seen in the non-recombining Muller F (Supplementary Figure 7).
Figure 4: Comparison of estimated statistics across theD. innubila genome for the Prescott (PR) population. Values are as follows: the average pairwise divergence per gene (DXY), the population fixation index per genes (FST), within population pairwise diversity per genes, Compositive Likelihood Ratio (CLR) per SNP calculated using Sweepfinder2 and within population average Tajima’s D per gene.