Sampling and genotyping
We analyzed a total of 304 Idaho ground squirrel samples. We conducted all sampling following standard biological and ethical requirements under the auspices of University of Idaho Animal Care and Use Committee Protocol #2015-53. We preserved samples consisting of buccal swabs in Qiagen ATL buffer and stored them at room temperature or 4oC until DNA extraction. For NIDGS, we obtained 232 samples from 17 populations in Adams County in central Idaho (Figure 1B), which represents approximately 60% of the known populations. For SIDGS, we collected 72 samples from 5 populations in Washington, Payette and Gem Counties (Figure 1B), which is only a small proportion of the > 100 sites identified during surveys in 2000 (CCAA 2002). Details on the location and number of samples collected per population are shown in Table 1. Samples were extracted using the Qiagen Blood and Tissue DNA extraction kit (Qiagen, Inc.), and quantified DNA using fluorometric quantitation with a Qubit™ double-stranded DNA High Sensitivity Assay Kit. We used genetic samples for Restriction Site Associated DNA Sequencing (RADseq) (Murphy et al. 2007; Bairdet al. 2008; Andrews et al. 2016). We prepared libraries following Ali et al. (2016) using the sbfI restriction enzyme, but excluding the last part of that protocol relating to targeted bait capture, and instead only using the new RADseq protocol with biotinylated adapters. We built four libraries, each comprised of ~80 individually barcoded samples, and sequenced each library on one lane of Illumina HiSeq 4000 at the University of Oregon Genomics & Cell Characterization Core Facility (GC3F), with 150 base pairs (bp) paired-end reads. We conducted all bioinformatic analyses using the Institute for Bioinformatics and Evolutionary Studies (IBEST) Computational Resources Core servers at the University of Idaho. For SNP calling, we used the software stacks version 2.2 (Catchenet al. 2013) to demultiplex and remove PCR duplicates . We used the thirteen-lined ground squirrel (Ictidomys tridecemlineatus ) genome (GCA_000236235.1) as a reference to align our data using the software bwa version 0.7.17 (Li 2013). From these alignments, we used the software SAMtools version 1.9 (Li et al. 2009). We conducted our analyses at both the interspecific and intraspecific levels in order to determine putative adaptive differences between the two sister species, but also searching for patterns of population structure and local adaptation among populations within species. To do so, we defined a first dataset consisting of all 304 IDGS, and then separated the samples by species, ‘NIDGS’ (232 samples) and ‘SIDGS’ (72 samples) datasets. We called SNPs for the ‘IDGS’ dataset using the multisample SNP caller (mpileup ) implemented in SAMtools , and subsequently used VCFTools version 0.1.14 (Danecek et al.2011) to exclude individuals with ≥50% missing data, keeping only biallelic SNPs that had <50% missing data, were located >10,000 base pairs apart, had >2% minor allele frequency and >3 reads. To produce the species specific datasets, we separated the samples by species (232 NIDGS and 72 SIDGS) and repeated the above filtering with a MAF value of 3%. We then used part of the R script available from Wright et al. (2019) to exclude loci with heterozygosity >70% and with greater than a 3:1 ratio of mean read depth for the reference allele versus the alternative allele, across individuals with these alleles, to avoid loci with significant reference bias. Finally, we used the functions from thewhoa R package to estimate heterozygote miscalls rates and excluded loci deviating from more than one standard deviation from the mean (http://150.185.130.98/rcran/web/packages/whoa/index.html). For all datasets, we compared the percentage of missing data per individual (–missing-indv) with mean read depth (–depth) and heterozygosity (–het) obtained using VCFTools version 0.1.14 (Daneceket al. 2011).