Comparison of linkage groups to SNP cohorts identified in previous work
Using the draft female genome assembly, Trevoy et al. (2019) conducted principal component analyses and found that plateaus of high-loading SNPs in linkage disequilibrium (LD), located on a number of scaffolds, were driving clustering patterns on the first four principal component (PC) axes: the plateaus in PCs 1 and 3 were primarily related to geography, the plateau in PC 2 was sex-linked, and the plateau in PC 4 was much smaller and not clearly attributable to geography or sex. To determine the physical linkage and chromosomal locations of these SNPs, we used BLAST+ v2.10.0 (Camacho et al., 2009) to assess the correspondence between the final female assembly and the draft female genome scaffolds containing the SNPs with the highest loadings, up to and including the plateaus in each PC shown in Trevoy et al. (2019). For PCs 1 and 2, we included draft scaffolds containing SNPs with loadings equal to or greater than 0.05; for PC 3 the cut-off was 0.087, and for PC 4 it was 0.1. We created a custom BLAST database from the final female assembly and then used BLASTn to query the draft scaffolds for each PC against it, specifying an e-value threshold of 10⁻⁵. For each PC, hits were sorted first by e-value and then by bitscore, and the single best match in the final assembly was retained for each draft scaffold.
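The best-hit selection described above can be illustrated with a minimal Python sketch. It is not the script used in this study. It assumes that BLASTn output was written in the standard 12-column tabular format (`-outfmt 6`), that hits are ranked by ascending e-value with ties broken by descending bitscore, and that the e-value threshold is 1e-5; the function names `parse_hits` and `best_hit_per_query` are hypothetical.

```python
import csv
from typing import Dict, Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str      # draft scaffold ID (qseqid, column 1)
    subject: str    # final-assembly sequence ID (sseqid, column 2)
    evalue: float   # column 11 of the default tabular output
    bitscore: float # column 12 of the default tabular output


def parse_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Iterator[Hit]:
    """Yield hits from tab-separated BLAST output, dropping those above the e-value threshold."""
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = Hit(row[0], row[1], float(row[10]), float(row[11]))
        if hit.evalue <= max_evalue:
            yield hit


def best_hit_per_query(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep one hit per query: lowest e-value first, then highest bitscore (assumed tie-break)."""
    best: Dict[str, Hit] = {}
    for hit in sorted(hits, key=lambda h: (h.evalue, -h.bitscore)):
        best.setdefault(hit.query, hit)  # the first hit seen per query is its best
    return best
```

Under these assumptions, passing an open results file for one PC to `best_hit_per_query(parse_hits(handle))` would return a dictionary mapping each draft scaffold to its single retained match in the final assembly.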