Population-specific Summary Statistics
To better understand the genetic diversity and demographics of the
worldwide head louse populations, we first calculated summary
statistics. We grouped individuals by continent and calculated
nucleotide diversity (pi) and Tajima’s D from SNPs across the
entire nuclear genome with a sliding-window approach in the R
package PopGenome (Pfeifer et al., 2014). We then calculated observed
(Ho) and expected (He) SNP heterozygosity for each country to identify
populations and loci that deviated significantly from Hardy-Weinberg
Equilibrium (HWE).
By grouping individuals by country, we accounted for spatially based
population substructure and thus for the Wahlund effect
(Wahlund, 1928), which would otherwise affect the countrywide
HWE calculations. All locus-by-locus HWE analyses were done in
Arlequin 3.5 (Excoffier & Lischer, 2010) with 1,000,000 steps in the Markov
chain after discarding 100,000 steps as burn-in. We assessed the
statistical significance of the deviations at p < 0.05
and calculated the proportion of loci that deviated significantly from HWE for each
population. In addition to SNP heterozygosity, we calculated
autosomal heterozygosity for each country, which indicates the proportion
of heterozygous sites in each country. SNP heterozygosity and autosomal
heterozygosity differ in that both polymorphic and monomorphic sites are
included in autosomal heterozygosity calculations. Polymorphic sites are
those which at least two individuals in each population had in common.
We excluded monomorphic sites from our HWE analyses because they
provide no useful information about the evolutionary forces acting upon
the population.
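The distinction between SNP heterozygosity and autosomal heterozygosity can be illustrated with a minimal Python sketch. It is not the Arlequin procedure used in this study. It assumes biallelic sites encoded as alternate-allele counts (0, 1, 2, or None for a missing call), uses the sample-size-corrected estimator of expected heterozygosity, and treats a site as polymorphic when both alleles are observed in the population. The function names and the encoding are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical encoding: alternate-allele count per individual (0, 1, 2);
# None marks a missing genotype call.
Genotype = Optional[int]


def locus_heterozygosity(
    genotypes: Sequence[Genotype],
) -> Optional[Tuple[float, float, float]]:
    """Return (alt-allele frequency, Ho, He) for one biallelic site.

    Returns None when fewer than two genotypes are called.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n < 2:
        return None
    p = sum(called) / (2 * n)                  # alternate-allele frequency
    ho = sum(1 for g in called if g == 1) / n  # observed heterozygosity
    # Expected heterozygosity with the small-sample correction 2n / (2n - 1).
    he = (2 * n / (2 * n - 1)) * 2 * p * (1 - p)
    return p, ho, he


def summarize_population(sites: Sequence[Sequence[Genotype]]) -> Dict[str, float]:
    """Summarize one population from a list of sites (one genotype list per site)."""
    stats = [s for s in (locus_heterozygosity(site) for site in sites) if s is not None]
    # Assumption: a site is polymorphic when both alleles occur in the sample.
    polymorphic = [s for s in stats if 0.0 < s[0] < 1.0]
    nan = float("nan")
    return {
        # SNP heterozygosity: averaged over polymorphic sites only.
        "snp_ho": mean([s[1] for s in polymorphic]) if polymorphic else nan,
        "snp_he": mean([s[2] for s in polymorphic]) if polymorphic else nan,
        # Autosomal heterozygosity: averaged over polymorphic and monomorphic sites.
        "autosomal_het": mean([s[1] for s in stats]) if stats else nan,
    }


# Toy population: four individuals, four sites (two of them monomorphic).
example_sites: List[List[Genotype]] = [
    [0, 1, 1, 2],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
summary = summarize_population(example_sites)
```

In this toy example the monomorphic sites lower the autosomal value relative to the SNP value, because they add to the number of sites without adding any heterozygous genotypes.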
To examine the distribution of genetic variability within and among
groups and populations, we ran an Analysis of Molecular Variance (AMOVA)
in Arlequin 3.5 (Excoffier & Lischer, 2010). Hierarchical groups
were defined as follows: samples were first grouped into their
respective populations (by country), and populations were then grouped
by continent (North America, South America, Asia,
Europe, Africa, and Oceania). The significance of the variance components
was tested using 1,000 permutations. Lastly, we calculated the pairwise
genetic differentiation (FST) between all pairs of countries
and all pairs of continents. The analysis was run in Arlequin,
and the significance of genetic differentiation was assessed with 1,000
permutations.
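The logic of a permutation test for pairwise FST can be sketched in a few lines of Python. This is an illustration only and does not reproduce Arlequin's estimator: it substitutes Hudson's FST (computed as a ratio of sums across loci), assumes biallelic SNPs encoded as alternate-allele counts with no missing data, and uses hypothetical function names. Individuals are shuffled between the two populations, and the p-value is the fraction of permuted FST values at least as large as the observed one.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical encoding: one individual = alternate-allele counts (0, 1, 2) per locus.
Individual = Sequence[int]


def hudson_fst(pop1: Sequence[Individual], pop2: Sequence[Individual]) -> float:
    """Hudson's FST for two populations, as a ratio of sums over loci."""
    n1, n2 = 2 * len(pop1), 2 * len(pop2)  # number of sampled chromosomes
    num = 0.0
    den = 0.0
    for locus in range(len(pop1[0])):
        p1 = sum(ind[locus] for ind in pop1) / n1
        p2 = sum(ind[locus] for ind in pop2) / n2
        # Squared frequency difference, corrected for finite sample size.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else 0.0


def fst_permutation_test(
    pop1: Sequence[Individual],
    pop2: Sequence[Individual],
    n_perm: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (observed FST, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = hudson_fst(pop1, pop2)
    pooled: List[Individual] = list(pop1) + list(pop2)
    k = len(pop1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign individuals to populations at random
        if hudson_fst(pooled[:k], pooled[k:]) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: two populations fixed for different alleles at three loci.
pop_a = [[0, 0, 0] for _ in range(5)]
pop_b = [[2, 2, 2] for _ in range(5)]
fst_ab, p_ab = fst_permutation_test(pop_a, pop_b, n_perm=1000, seed=1)
```

Shuffling whole individuals rather than single genotypes keeps each individual's multilocus genotype intact, so only the population labels are randomized.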