Genotyping SNPs
To generate the SNP dataset, we followed the PHYLUCE v.1.5.0 pipeline with default parameters (Faircloth, 2016) for sequence trimming,de novo assembly of contigs, identification of UCE loci, and sequence alignment. We generated a pseudo-genomic reference by aligning each locus with MAFFT v7.407 and trimming using Gblocks v0.91b with default parameters (Castresana, 2000). We then used Geneious v9.1.2 to generate a consensus sequence for each locus, replacing ambiguity codes with an appropriate nucleotide at random. We used Picard v1.106 (http://broadinstitute.github.io/picard/), and SAMtools v1.9 (Li et al., 2009) to create sequence dictionaries and reference indices from the reference. We used the PHYLUCE script snps.py to automate alignment of trimmed reads from each sample to the reference with BWA-MEM v0.7.17, and then called SNPs with the HaplotypeCaller tool of the Genome Analysis Toolkit v3.7 (McKenna et al., 2010) following Giarla and Esselstyn (2015). Using VCFtools v0.1.14 (Danecek et al., 2011), we removed SNPs that failed to pass GATK quality filters (QD < 2.0 || FS > 60.0 || MQ < 40.0 || HaplotypeScore > 13.0 || MappingQualityRankSum < -12.5 || ReadPosRankSum < -8.0), and selected SNPs with a minimum depth of coverage of 8 per individual and a minor allele frequency ≥ 5%. We used HD_plot.py (McKinney, Waples, Seeb, & Seeb, 2017) to filter SNPs resulting from putative paralogs or wrongly assembled contigs from the dataset by removing SNPs with heterozygosity > 0.75 and a read-ratio deviation score D >10. The D statistic is a measure of deviation from the expected allelic read ratio of 1:1 when reads are summed over all heterozygous individuals. This method more accurately identifies true SNP loci than methods relying on read depth or heterozygote excess alone (McKinney et al., 2017). After filtering with HD_plot.py , we removed SNPs that were out of Hardy-Weinberg Equilibrium after Bonferroni correction for multiple comparisons (p < 10-5) using VCFtools v0.1.16. To generate a set of unlinked SNPs, we selected a single SNP per UCE using VCFtools v0.1.16 ‘-thin 2000’. We used the unlinked SNP dataset with 10% missing data for all SNP-based analyses except calculation of genetic diversity and effective population size, Principal Components Analysis (PCA), Discriminant Analysis of Principal Components (DAPC) and Bayesian cluster analysis with STRUCTURE v2.3.4 (Pritchard, Stephens, & Donnelly, 2000), for which we used a dataset with no missing data.