Gene flow and phylogeny
To detect gene flow and possible migration events between louse
populations, we used TREEMIX (Pickrell & Pritchard, 2012) to build a
population tree and add likely migration events to it. TREEMIX infers
gene flow between populations and its direction by comparing the
covariance predicted by the bifurcating tree with the covariance
observed between populations. We ran TREEMIX with m = 1 to m = 14
migration edges, with four replicates for each value of m and with
varying SNP window sizes. We assessed the variance explained by the
models, standard error, and likelihood scores using the Evanno method
(Evanno et al., 2005) as implemented in the OptM R package (Fitak,
2021). Delta m (Δm) was calculated for each run, and the optimum number
of migration edges that fit the data was chosen accordingly. A summary of
the percentage of variance explained by each model, significance of the
migration events, and the associated likelihood scores can be found in
Figure S3.
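The delta m statistic can be illustrated with a minimal sketch. It assumes an Evanno-style definition: the absolute second-order difference of the mean log-likelihood across m, divided by the standard deviation of the replicate log-likelihoods at m. This is not the OptM source code, and the function name delta_m and all likelihood values below are invented for illustration only.

```python
from statistics import mean, stdev


def delta_m(loglik):
    """Evanno-style delta m (illustrative, not OptM's implementation).

    loglik maps each number of migration edges m to a list of
    log-likelihoods, one per replicate run.
    """
    means = {m: mean(values) for m, values in loglik.items()}
    scores = {}
    for m in sorted(loglik):
        # The second difference needs both neighbours, so the smallest
        # and largest m receive no score.
        if m - 1 not in means or m + 1 not in means:
            continue
        sd = stdev(loglik[m])
        if sd == 0:
            raise ValueError(f"no variation among replicates at m={m}")
        second_diff = abs(means[m + 1] - 2 * means[m] + means[m - 1])
        scores[m] = second_diff / sd
    return scores


# Invented log-likelihoods: four replicates for each of m = 1..4.
example = {
    1: [100.0, 101.0, 99.0, 100.0],
    2: [150.0, 151.0, 149.0, 150.0],
    3: [155.0, 156.0, 154.0, 155.0],
    4: [158.0, 159.0, 157.0, 158.0],
}
scores = delta_m(example)
best_m = max(scores, key=scores.get)
print(best_m, scores)
```

In this toy input the likelihood rises sharply from m = 1 to m = 2 and then flattens, so the largest score falls at m = 2. The statistic is undefined when replicates at a given m have identical likelihoods, which is one reason to vary run settings among replicates.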
Finally, to evaluate the evolutionary relationships among the lice from
the countries in our dataset, we constructed a neighbor-joining tree
from Euclidean genetic distances between populations using the R
package dartR (Gruber et al., 2018). To root the phylogenetic tree with an outgroup
and evaluate ancestral populations, we identified SNPs from 16
chimpanzee lice (Pediculus schaeffi) using the body louse genome
as the reference (Kirkness et al., 2010) and following the same
bioinformatic pipeline as outlined in Figure 1. The resulting dataset
underwent the same post-processing filtering steps as the head louse data (see
methods above). The final filtered SNPs from both species were
intersected using bcftools isec to identify variants that were common
to both species. The final dataset for the phylogenetic analysis
including the outgroup contained 889 variants. The resulting
neighbor-joining tree built from these 889 SNPs was imported into FigTree for
visualization.
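The distance matrix underlying a neighbor-joining tree of this kind can be sketched as follows. The sketch assumes that Euclidean genetic distance is taken between per-population alternate-allele frequency vectors over the shared SNPs. It is not the dartR implementation, and the population labels and frequencies are hypothetical.

```python
import math


def euclidean_distances(freqs):
    """Pairwise Euclidean distances between populations.

    freqs maps a population label to its alternate-allele frequencies,
    listed in the same SNP order for every population.
    """
    lengths = {len(values) for values in freqs.values()}
    if len(lengths) != 1:
        raise ValueError("all populations must cover the same SNPs")
    dist = {}
    for a in freqs:
        for b in freqs:
            squared = sum((x - y) ** 2 for x, y in zip(freqs[a], freqs[b]))
            dist[(a, b)] = math.sqrt(squared)
    return dist


# Hypothetical frequencies at three SNPs shared by all groups.
freqs = {
    "pop_A": [0.0, 0.5, 1.0],
    "pop_B": [0.0, 0.5, 0.0],
    "outgroup": [1.0, 0.5, 1.0],
}
dist = euclidean_distances(freqs)
for (a, b), d in sorted(dist.items()):
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: {d:.3f}")
```

A matrix like this is the input to a neighbor-joining routine, which joins the closest pairs iteratively. Restricting the calculation to SNPs present in every group is what makes the intersection step necessary before the outgroup can be added.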