Gene flow and phylogeny
To detect gene flow and possible migration events between louse populations, we used TREEMIX (Pickrell & Pritchard, 2012) to build a population tree and then add likely migration events to it. TREEMIX models gene flow between populations, including its direction, by comparing the covariance predicted by the bifurcating tree with the observed covariance between populations. We ran the TREEMIX analysis with m = 1 to m = 14 migration edges, with four replicates for each value of m and varying SNP window sizes. We assessed the variance explained by the models, the standard error, and the likelihood scores using the Evanno method (Evanno et al., 2005) as implemented in the OptM R package (Fitak, 2021). Δm was calculated for each value of m, and the number of migration edges that best fit the data was chosen accordingly. A summary of the percentage of variance explained by each model, the significance of the migration events, and the associated likelihood scores can be found in Figure S3.
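As an illustration of how an Evanno-style Δm statistic selects the number of migration edges, the following minimal Python sketch computes the absolute second-order rate of change of the mean log-likelihood, divided by the standard deviation of the log-likelihood across replicates. It is not the OptM implementation; the function name `delta_m` and the toy log-likelihood values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, stdev


def delta_m(loglik_by_m):
    """Evanno-style delta-m from replicate log-likelihoods.

    loglik_by_m maps each number of migration edges m to a list of
    log-likelihoods, one per replicate run (hypothetical input format).
    Returns a dict mapping m to delta-m for every m that has both
    neighbours (m - 1 and m + 1) and a non-zero standard deviation.
    """
    result = {}
    for m in sorted(loglik_by_m):
        if (m - 1) not in loglik_by_m or (m + 1) not in loglik_by_m:
            continue  # the second-order difference needs both neighbours
        sd = stdev(loglik_by_m[m])
        if sd == 0:
            continue  # delta-m is undefined when replicates are identical
        second_diff = abs(
            mean(loglik_by_m[m + 1])
            - 2 * mean(loglik_by_m[m])
            + mean(loglik_by_m[m - 1])
        )
        result[m] = second_diff / sd
    return result


# Toy values for illustration only; these are not results from the study.
toy_logliks = {
    1: [100.0, 102.0],
    2: [150.0, 154.0],
    3: [160.0, 164.0],
    4: [165.0, 167.0],
}
toy_delta = delta_m(toy_logliks)
best_m = max(toy_delta, key=toy_delta.get)
```

Note that Δm cannot be evaluated at the smallest or largest m tested, and that replicates with identical likelihoods give a zero standard deviation, which is one reason to vary run settings such as the SNP window size between replicates.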
Finally, to evaluate the evolutionary relationships among the lice from the countries in our dataset, we constructed a neighbor-joining tree from Euclidean genetic distances between populations using the R package dartR (Gruber et al., 2018). To root the phylogenetic tree with an outgroup and evaluate ancestral populations, we identified SNPs from 16 chimpanzee lice (Pediculus schaeffi) using the body louse genome as the reference (Kirkness et al., 2010) and following the same bioinformatic pipeline as outlined in Figure 1. The resulting dataset underwent the same post-processing filtering steps as the head lice dataset (see methods above). The final filtered SNPs from both species were intersected using bcftools isec to identify variants that were common to both species. The final dataset for the phylogenetic analysis, including the outgroup, contained 889 variants. The neighbor-joining tree built from these 889 SNPs was imported into FigTree for visualization.
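To make the distance step concrete, the sketch below computes pairwise Euclidean distances between populations from per-population allele frequencies at a shared set of SNPs, which is the kind of matrix a neighbor-joining algorithm takes as input. It is a minimal illustration under stated assumptions, not the dartR implementation: the function name, the input format, the population labels, and the frequencies are all hypothetical.

```python
import math


def euclidean_distance_matrix(freqs):
    """Pairwise Euclidean distances between populations.

    freqs maps a population label to a list of allele frequencies,
    one per SNP, with SNPs in the same order for every population
    (hypothetical input format).
    Returns a dict keyed by (population_a, population_b).
    """
    lengths = {len(values) for values in freqs.values()}
    if len(lengths) != 1:
        raise ValueError("all populations must have the same number of SNPs")

    dist = {}
    pops = sorted(freqs)
    for a in pops:
        for b in pops:
            squared = sum((x - y) ** 2 for x, y in zip(freqs[a], freqs[b]))
            dist[(a, b)] = math.sqrt(squared)
    return dist


# Toy frequencies for illustration only; these are not data from the study.
toy_freqs = {
    "pop_A": [0.0, 0.0],
    "pop_B": [0.6, 0.8],
    "outgroup": [1.0, 1.0],
}
toy_dist = euclidean_distance_matrix(toy_freqs)
```

The matrix is symmetric with zeros on the diagonal, so a tree-building routine only needs one triangle of it.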