Population structure
Population structure analyses were performed separately for each
species. To obtain a set of “neutral markers” for each species for
these analyses, we first removed from each dataset the loci identified
as candidates for adaptation by the pRDA. Next, we identified loci that
deviated from Hardy-Weinberg Equilibrium (HWE) proportions for all
samples combined using the function –hardy from VCFtools, and we
removed loci with p < 0.05 from the dataset.
We first estimated isolation-by-distance (IBD) using the mantelfunction from the vegan R package (Oksanen et al. 2011)
for both the neutral and adaptive datasets. We used the functiondist.genpop from the adegenet R package (Jombart & Ahmed
2011) to calculate Nei’s genetic distances (Nei 1972, 1978) between all
pairs of populations, and plotted those genetic distances against
pairwise Euclidean geographic distances (average geographic location of
each population). We performed this for both NIDGS and SIDGS datasets
and for neutral and adaptive loci separately.
We performed PCA to obtain a visual inspection of population structure
without underlying models of evolution. We used the snpgdsPCAfunction from the SNPRelate R package (Zheng et al. 2012)
to perform the PCA for the complete dataset (all loci), the neutral
dataset, and the adaptive dataset of each species. We used theggplot function from the ggplot 2 R package to construct
the plots (Kassambara 2018). For the neutral data, we also evaluated
population structure for each species using an approach with an
underlying Bayesian model to estimate the number of populations
(K ), using the program STRUCTURE (Pritchard et al.2000) and the ParallelStructure R package (Besnier & Glover
2013). We applied an admixture model run for 1x106generations, and five replicates per K , with K = 1 toK = 13 in NIDGS and to K = 5 in SIDGS, which correspond to
the total number of populations sampled for each species. We then
determined the optimal number of populations (K ) according to the
Evanno method and the rate of change in the likelihood values (Evannoet al. 2005).
We estimated interspecific population differentiation using the IDGS
total dataset, and intraspecific population differentiation for NIDGS
and SIDGS ‘neutral’ and ‘adaptive’ datasets, using the functionsgenet.dist and boot.ppfst from the R packagehierfstat to calculate genetic divergence
(F ST) with 999 bootstraps between all geographic
sites (Goudet 2005). We then built correlation matrices of pairwiseF ST between all populations using the functioncorrplot from the corrplot R package (Wei et al.2017). Using the same datasets, we determined genetic diversity by
estimating observed and expected heterozygosity
(H O and H E, respectively)
per population. To estimate H O andH E, we used the function basic.stats from
the hierfstat R package (Goudet 2005), averaging locus
heterozygosity per population. We then performed a paired Wilcoxon test
to identify significant differences between H Oand H E for both neutral and adaptive loci among
populations of each species, between HE of both neutral
and adaptive loci among populations of each species, and
HE between neutral and adaptive datasets for each
species, using the function t.test from the stats R
package.