Population-specific Summary Statistics
To better understand the genetic diversity and demography of worldwide head louse populations, we first calculated summary statistics. We grouped individuals by continent and calculated nucleotide diversity (π) and Tajima’s D from SNPs across the entire nuclear genome, using a sliding-window approach implemented in the R package PopGenome (Pfeifer et al., 2014). We then calculated SNP heterozygosity for each country (observed heterozygosity, Ho, and expected heterozygosity, He) to identify populations and loci that deviated significantly from Hardy-Weinberg equilibrium (HWE). Grouping individuals by country imposes a spatially based population substructure, which accounts for the Wahlund effect (Wahlund, 1928) that would otherwise affect the HWE calculations. All locus-by-locus HWE analyses were run in Arlequin 3.5 (Excoffier & Lischer, 2010) with a Markov chain of 1,000,000 steps after 100,000 steps were discarded as burn-in. We assessed the statistical significance of the deviations at p < 0.05 and calculated the proportion of loci that deviated significantly from HWE in each population. In addition to SNP heterozygosity, we calculated autosomal heterozygosity for each country, defined as the proportion of heterozygous sites in that country. The two measures differ in that autosomal heterozygosity includes both polymorphic and monomorphic sites, whereas SNP heterozygosity includes polymorphic sites only. We defined polymorphic sites as those that at least two individuals in each population had in common. We excluded monomorphic sites from the HWE analyses because they provide no information about the evolutionary forces acting on a population.
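The difference between SNP heterozygosity and autosomal heterozygosity can be illustrated with a minimal Python sketch. It is not the Arlequin procedure used in this study. It assumes biallelic genotypes coded as alternate-allele counts (0, 1 or 2), treats a site as polymorphic when both alleles are present in the sample (an assumption of the sketch), and uses a sample-size-corrected expected heterozygosity. The function names and the toy data are hypothetical.

```python
def site_summary(calls):
    """Summarise one site from per-individual alternate-allele counts (0, 1 or 2)."""
    n = len(calls)                                 # number of diploid individuals
    p = sum(calls) / (2 * n)                       # alternate-allele frequency
    ho = sum(1 for g in calls if g == 1) / n       # observed heterozygosity
    # Expected heterozygosity with a small-sample correction (assumed estimator).
    he = (2 * n) / (2 * n - 1) * 2 * p * (1 - p)
    return p, ho, he


def snp_heterozygosity(genotypes):
    """Mean Ho and He over polymorphic sites only (rows = individuals, columns = sites)."""
    sites = [site_summary(column) for column in zip(*genotypes)]
    polymorphic = [(ho, he) for p, ho, he in sites if 0 < p < 1]
    if not polymorphic:
        raise ValueError("no polymorphic sites in this population")
    mean_ho = sum(ho for ho, _ in polymorphic) / len(polymorphic)
    mean_he = sum(he for _, he in polymorphic) / len(polymorphic)
    return mean_ho, mean_he


def autosomal_heterozygosity(genotypes):
    """Proportion of heterozygous calls over all sites, monomorphic ones included."""
    calls = [g for row in genotypes for g in row]
    return sum(1 for g in calls if g == 1) / len(calls)


# Hypothetical toy population: 4 individuals, 5 sites (sites 2 and 4 are monomorphic).
toy_population = [
    [0, 0, 1, 2, 0],
    [1, 0, 1, 2, 0],
    [1, 0, 1, 2, 1],
    [2, 0, 1, 2, 0],
]

ho, he = snp_heterozygosity(toy_population)
auto_het = autosomal_heterozygosity(toy_population)
print(f"Ho = {ho:.3f}, He = {he:.3f}, autosomal heterozygosity = {auto_het:.3f}")
```

Because the monomorphic sites contribute only homozygous calls to the denominator, the autosomal value in this toy example is lower than the SNP-based Ho.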
To examine the distribution of genetic variability within and among groups and populations, we ran an Analysis of Molecular Variance (AMOVA) in Arlequin 3.5 (Excoffier & Lischer, 2010). Hierarchical groups were defined as follows: samples were first grouped into populations by country, and populations were then grouped by geographic continent (North America, South America, Asia, Europe, Africa, and Oceania). The significance of the variance components was tested with 1000 permutations. Lastly, we calculated pairwise genetic differentiation (FST) for all pairs of countries and for all pairs of continents. This analysis was also run in Arlequin, and the significance of genetic differentiation was assessed with 1000 permutations.
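The logic of a permutation test for pairwise FST can be sketched as follows. This is a simplified illustration and not Arlequin's AMOVA-based estimator: it assumes biallelic genotypes coded as alternate-allele counts (0, 1 or 2), computes FST as (HT - HS) / HT summed over sites with sample-size weights, and obtains a p-value by reshuffling individuals between the two populations. The function names, the random seed and the toy data are hypothetical.

```python
import random


def allele_frequencies(population):
    """Alternate-allele frequency per site (rows = individuals, columns = sites)."""
    n = len(population)
    return [sum(column) / (2 * n) for column in zip(*population)]


def pairwise_fst(pop_a, pop_b):
    """Heterozygosity-based FST between two populations (simplified estimator)."""
    n_a, n_b = len(pop_a), len(pop_b)
    w_a, w_b = n_a / (n_a + n_b), n_b / (n_a + n_b)
    hs_total = 0.0  # mean within-population expected heterozygosity, summed over sites
    ht_total = 0.0  # expected heterozygosity of the pooled sample, summed over sites
    for p_a, p_b in zip(allele_frequencies(pop_a), allele_frequencies(pop_b)):
        p_bar = w_a * p_a + w_b * p_b
        hs_total += w_a * 2 * p_a * (1 - p_a) + w_b * 2 * p_b * (1 - p_b)
        ht_total += 2 * p_bar * (1 - p_bar)
    return 0.0 if ht_total == 0 else (ht_total - hs_total) / ht_total


def fst_permutation_test(pop_a, pop_b, n_permutations=1000, seed=1):
    """Return observed FST and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = pairwise_fst(pop_a, pop_b)
    pooled = list(pop_a) + list(pop_b)
    as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign individuals to populations at random
        permuted = pairwise_fst(pooled[:len(pop_a)], pooled[len(pop_a):])
        if permuted >= observed - 1e-12:
            as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_permutations + 1)


# Hypothetical toy data: two populations fixed for different alleles at 3 sites.
pop_a = [[0, 0, 0] for _ in range(5)]
pop_b = [[2, 2, 2] for _ in range(5)]

fst, p_value = fst_permutation_test(pop_a, pop_b)
print(f"FST = {fst:.3f}, p = {p_value:.4f}")
```

Shuffling individuals rather than alleles keeps each multilocus genotype intact, so the null distribution reflects random assignment of whole individuals to populations.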