Modeling migration
We used the program MIGRATE-N v3.6 with its Bayesian implementation (Beerli, 2005) to compare support for six different models of population structure and migration (Figure 2). We used the phased pseudo-haplotype sequence dataset for this analysis to take advantage of the higher information content in DNA sequences relative to SNPs. Although our workflow may generate chimeric sequences in instances where phase is unresolved, we do not expect that this would affect model selection (Andermann et al., 2018; Beerli pers. comm. ). To ensure that phase did not affect model inference, we ran the MIGRATE-N analysis twice with different configurations of variants within haplotypes while maintaining all other settings.
We compared the following models: 1) panmixia, 2) four populations (high elevation MK, low elevation MK, low elevation MT and high elevation MT) with bidirectional migration between adjacent pairs, 3) three populations (high elevation MT, low elevation MT and all of MK) with bidirectional migration between adjacent pairs, 4) three populations with migration between all pairs, 5) two populations (high elevation MT separate from all others) with bidirectional migration, and 6) two populations with unidirectional migration from high elevation MT (Figure 2).
For model 2, we assigned individuals to populations based on their sampling location: low elevation individuals < 2000 masl, and high elevation ≥ 2000 masl. For models 3, 4, 5, and 6, we assigned individuals based on STRUCTURE output for K = 3 and K = 2, respectively. We randomly selected five individuals from each population cluster (n = 10 haplotypes). We did not include all individuals because for coalescent processes, increasing the sample size above this does not necessarily improve accuracy, but substantially increases computation time (Felsenstein, 2005). For each model, we ran two long chains of 20,000,000 steps, sampled every 100 steps with 50,000 steps per chain discarded as burn-in, and with four heated chains. To ensure comparability across models, we ran the most complex model first and used the same prior distributions and run parameters for all subsequent models. We assessed chain mixing through acceptance ratios and ESS of parameters and genealogies. We calculated log Bayes factors (LBF) and model probability using the Bezier approximation of the marginal model likelihood and the formula described in (Beerli & Palczewski, 2010).