Signatures of local adaptive divergence across D. innubila populations
We downloaded gene ontology groups from FlyBase (Gramates et al. 2017). We then used a gene enrichment analysis to identify gene categories enriched among genes in the 97.5th percentile and the 2.5th percentile for FST, Tajima’s D and pairwise diversity, versus all other genes (Subramanian et al. 2005). Because Muller elements A and B differed from the other chromosomes in some cases, we also repeated this analysis chromosome by chromosome, taking the genes in the 97.5th percentile of each chromosome.
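The outlier selection described above can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: the function names (`percentile`, `tail_genes`, `enrichment_p`) and the input layout are hypothetical, and the enrichment step is a plain one-sided hypergeometric test swapped in for simplicity, which is not the same procedure as the cited enrichment method.

```python
from collections import defaultdict
from math import comb


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def tail_genes(stats, lower_q=2.5, upper_q=97.5, per_chromosome=False):
    """Split genes into lower-tail and upper-tail sets.

    stats maps gene -> (chromosome, value), an assumed layout.
    With per_chromosome=True the cutoffs are computed per chromosome.
    """
    groups = defaultdict(dict)
    for gene, (chrom, value) in stats.items():
        groups[chrom if per_chromosome else "genome"][gene] = value
    lower, upper = set(), set()
    for genes in groups.values():
        lo_cut = percentile(genes.values(), lower_q)
        hi_cut = percentile(genes.values(), upper_q)
        lower |= {g for g, v in genes.items() if v < lo_cut}
        upper |= {g for g, v in genes.items() if v > hi_cut}
    return lower, upper


def enrichment_p(outliers, category, background):
    """One-sided hypergeometric P(overlap >= observed overlap)."""
    n_bg = len(background)
    n_cat = len(category & background)
    n_out = len(outliers & background)
    n_hit = len(outliers & category & background)
    total = comb(n_bg, n_out)
    tail = sum(
        comb(n_cat, i) * comb(n_bg - n_cat, n_out - i)
        for i in range(n_hit, min(n_cat, n_out) + 1)
    )
    return tail / total
```

Calling `tail_genes(stats, per_chromosome=True)` mirrors the chromosome-by-chromosome repeat, so a chromosome with systematically higher values cannot supply all of the outliers.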
We next looked for selective sweeps in each population using SweepFinder2 (Huber et al. 2016). We converted the polarized VCF file to a folded allele frequency file giving allele counts for each base. We then ran SweepFinder2 on all called polymorphisms in each population to detect selective sweeps in 1 kbp windows (Huber et al. 2016). We reformatted the results and looked for genes neighboring or overlapping regions with high-confidence evidence of a selective sweep, seen as peaks above the genomic background. We identified peaks as 1 kbp windows in the 97.5th percentile of the composite likelihood ratio for each chromosome.
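The peak survey can be illustrated with a short sketch. It assumes the sweep scan output has already been parsed into `(chromosome, start, end, CLR)` tuples. The function names and the 1 kbp `flank` used to define "neighboring" are hypothetical choices for illustration, not values taken from this study.

```python
from collections import defaultdict


def fold_count(allele_count, sample_size):
    """Fold an allele count by reporting the minor allele count."""
    return min(allele_count, sample_size - allele_count)


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def clr_peaks(windows, q=97.5):
    """Return windows whose CLR exceeds the per-chromosome q-th percentile.

    windows is a list of (chrom, start, end, clr) tuples (assumed layout).
    """
    by_chrom = defaultdict(list)
    for window in windows:
        by_chrom[window[0]].append(window)
    peaks = []
    for chrom_windows in by_chrom.values():
        cutoff = percentile([w[3] for w in chrom_windows], q)
        peaks.extend(w for w in chrom_windows if w[3] > cutoff)
    return peaks


def genes_near_peaks(peaks, genes, flank=1000):
    """Return names of genes overlapping a peak or within flank bp of one.

    genes is a list of (name, chrom, start, end) tuples (assumed layout).
    """
    hits = set()
    for name, g_chrom, g_start, g_end in genes:
        for p_chrom, p_start, p_end, _clr in peaks:
            same_chrom = g_chrom == p_chrom
            if same_chrom and g_start <= p_end + flank and g_end >= p_start - flank:
                hits.add(name)
                break
    return hits
```

Computing the cutoff separately for each chromosome keeps a chromosome with a uniformly elevated composite likelihood ratio from dominating the set of peaks.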
Using the total VCF with outgroup information, we next calculated Dxy per SNP for all pairwise population comparisons (Nei and Miller 1990), as well as within-population pairwise diversity and dS from the outgroups, using a custom Python script. We then calculated the average Dxy and dS per gene and looked for gene enrichments among genes in the 97.5th percentile versus all other genes.
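As a rough illustration of the per-SNP calculation, the sketch below computes Dxy and within-population pairwise diversity at a biallelic site from allele counts, then averages per-site values by gene. It is not the custom script used here: the function names are hypothetical, sites are assumed to be biallelic, and the per-gene value is assumed to be a simple mean over the SNPs assigned to each gene.

```python
from collections import defaultdict


def dxy_site(alt1, n1, alt2, n2):
    """Dxy at one biallelic site from alternate-allele counts.

    alt1/n1 and alt2/n2 are the alternate allele count and the number of
    sampled chromosomes in populations 1 and 2.
    """
    p1 = alt1 / n1
    p2 = alt2 / n2
    # Probability that one allele drawn from each population differs.
    return p1 * (1 - p2) + p2 * (1 - p1)


def pi_site(alt, n):
    """Within-population pairwise diversity at one biallelic site.

    Includes the n / (n - 1) correction for sampling without replacement.
    """
    if n < 2:
        return 0.0
    p = alt / n
    return 2 * p * (1 - p) * n / (n - 1)


def mean_per_gene(site_values):
    """Average per-site values by gene; site_values yields (gene, value)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for gene, value in site_values:
        totals[gene] += value
        counts[gene] += 1
    return {gene: totals[gene] / counts[gene] for gene in totals}
```

A fixed difference between two populations gives `dxy_site(10, 10, 0, 10) == 1.0`, while a site at equal intermediate frequency in both gives `dxy_site(5, 10, 5, 10) == 0.5`.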