Diversity and Population Structure
In this study, we examined worldwide populations of head lice to understand their genetic diversity and distribution, which may help shed light on major events in the dispersal of their human hosts throughout the recent past. We investigated the worldwide population structure of human head lice using whole-genome nuclear SNPs. Multiple clustering analyses agreed on an overall population structure of five genetically distinct nuclear clusters. In all analyses that evaluated population structure, sub-Saharan African individuals separated from the rest of the individuals, whether we considered only two populations (Africa and the remainder of the world) or as many as five genetic populations (Figure 4). This African cluster is also the most genetically diverse, possessing a greater number of polymorphic sites than the non-African populations. Through comparison to an outgroup (the chimpanzee louse), we were able to confirm that this sub-Saharan African population is sister and basal to the non-African individuals (Figure 7). These results are consistent with Yong et al. (2003), who also found a clear geographic separation between African and non-African lice using the partial nuclear genes 18S rRNA and EF1-alpha. Human genetics and history present a similar pattern, with a great deal of genetic diversity found in sub-Saharan African populations (Campbell & Tishkoff, 2008). Hallmarks of this pattern include the highest nucleotide diversity, the highest observed heterozygosity, and a high percentage of variation separating these populations from all others. In human population genetics, all of these characteristics are taken as evidence that Africa is the source population of all modern-day humans (J. Z. Li et al., 2008; Xing et al., 2010).
Upon closer investigation of the five nuclear genetic clusters that we uncovered in human head lice, we detected some global patterns that are similar to those of their host. In Asia + Oceania, lice split between two genetic clusters consisting of Southeast Asian and South Asian individuals (Figure 4). This geographical split between South and Southeast Asia is consistent with the southern expansion route proposed for human dispersals into Asia (Macaulay et al., 2005; Reyes-Centeno et al., 2014; Tassi et al., 2015). Southeast Asia further divided into four geographically structured sub-clusters: a Thailand + Laos + Cambodia genetic cluster, and China, Papua New Guinea, and the Philippines each forming separate genetic clusters (Figure S7), reflecting the fine-scale population structure of humans in that area (Henn et al., 2010). In contrast, the South Asian genetic cluster includes lice from locations that are geographically distant from South Asia (i.e., Mongolia, Hungary, and Egypt). The genetic affinity of the samples from Hungary and Egypt could be the result of sampling from recent immigrants or travelers (in both cases, the samples were obtained from a single individual). However, our TREEMIX analysis showed gene flow between Hungary and Egypt with a relatively high migration weight, which was highly significant (p < 10^-308). This gene flow event may explain the similarity between Hungary and Egypt. While the genetic affinity among these groups may reflect louse demographic history, further research is needed to better understand these relationships. Furthermore, because sampling in Hungary and Egypt was limited, these samples may not reflect the genetic structure found across these countries, and any interpretation based on a sample derived from a single host should be made with caution.
The samples from Europe and the Americas showed little differentiation, grouping together in the PCA, DAPC, and fastSTRUCTURE analyses. Nuclear diversity, heterozygosity, and FST values were similarly low among the European, North American, and South American samples, which is unexpected given the host population structure and dispersal history. In humans, African populations have the highest genetic diversity, followed by Europeans and Asians, with the lowest genetic diversity in indigenous American populations (Rosenberg et al., 2002). However, one key difference between the host and parasite diversity patterns is the sampling strategy. In this study, lice were not collected from isolated ethnic groups or from aboriginal Americans, as they are in human genetic diversity studies. Therefore, the genetic similarity between European and American louse populations could be due to more recent gene flow (e.g., during European colonization of the Americas). Alternatively, the low genetic diversity and similarity of European and American (North and South American) lice could be due to selective pressures from insecticide use. The heavy use of pyrethroid insecticides to control louse infestations in Europe and the Americas (Diamantis et al., 2009) may have reduced the genetic variation among these populations, but further investigation is needed, and is currently underway, to test this alternative explanation. It is also possible that our lice were sampled from people of European descent in cities outside of Europe (i.e., in the Americas). At K=5 clusters, a subset of the continental North American samples, primarily from Central America, separates from the Europe + Americas cluster. The populations in this cluster also had moderate levels of observed heterozygosity (0.05-0.08), greater than those in European and other American populations.
These Central American louse populations may have experienced less insecticide exposure, or their genetic variation may reflect earlier louse demographic history that could not be observed in the other American louse populations. For example, our TREEMIX analysis using allele frequency distributions shows a potential ancestral gene flow event between the Philippines and the countries in this Central American cluster (Figure 8), suggesting intercontinental mixing at some point in time.
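For readers less familiar with the summary statistics compared above (observed heterozygosity, number of polymorphic sites, and FST), the following is a minimal sketch of how they can be computed from biallelic SNP genotypes. It is an illustration only and not the pipeline used in this study: the function names, the 0/1/2 genotype coding, and the toy data are hypothetical, and the FST shown is a Hudson-style ratio-of-averages estimator chosen for simplicity, since the text does not state which estimator was applied.

```python
from typing import List, Optional, Tuple

# Hypothetical coding: genos[individual][site] is the number of alternate
# alleles (0, 1, or 2) at a biallelic SNP, or None for a missing call.
Genotypes = List[List[Optional[int]]]


def observed_heterozygosity(genos: Genotypes) -> float:
    """Fraction of non-missing genotype calls that are heterozygous."""
    called = [g for ind in genos for g in ind if g is not None]
    if not called:
        raise ValueError("no genotype calls")
    return sum(1 for g in called if g == 1) / len(called)


def allele_counts(genos: Genotypes) -> List[Tuple[int, int]]:
    """Per site: (alternate allele count, number of called chromosomes)."""
    counts = []
    for site in range(len(genos[0])):
        calls = [ind[site] for ind in genos if ind[site] is not None]
        counts.append((sum(calls), 2 * len(calls)))
    return counts


def polymorphic_sites(genos: Genotypes) -> int:
    """Number of sites where both alleles are observed in the sample."""
    return sum(1 for alt, n in allele_counts(genos) if 0 < alt < n)


def hudson_fst(pop_a: Genotypes, pop_b: Genotypes) -> float:
    """Hudson-style FST between two populations (ratio of averages).

    Assumption: this estimator is used for illustration only.
    """
    num = 0.0
    den = 0.0
    for (alt1, n1), (alt2, n2) in zip(allele_counts(pop_a), allele_counts(pop_b)):
        if n1 < 2 or n2 < 2:
            continue  # too few called chromosomes for the sample-size correction
        p1, p2 = alt1 / n1, alt2 / n2
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        den += p1 * (1 - p2) + p2 * (1 - p1)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


# Toy data (hypothetical): two individuals per population, three SNPs.
pop_a = [[0, 1, 2], [1, 1, 0]]
pop_b = [[0, 0, 2], [0, 0, 2]]

het_a = observed_heterozygosity(pop_a)
het_b = observed_heterozygosity(pop_b)
poly_a = polymorphic_sites(pop_a)
poly_b = polymorphic_sites(pop_b)
fst_ab = hudson_fst(pop_a, pop_b)
```

In this toy example, the first population carries more heterozygous calls and more polymorphic sites than the second, which is the kind of contrast described above between the more diverse and less diverse louse clusters. Real analyses would work from filtered whole-genome SNP calls and many more individuals.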
The geographically structured genetic clusters we uncovered here are concordant with previous findings based on microsatellite markers from eight localities around the world (Ascunce et al., 2013). Our current dataset adds information about African populations and about how the genetic diversity of human head lice has been distributed across the world by ancestral demographic events. Based on our analyses of population substructure, the five major nuclear clusters that we uncovered are further subdivided into major regions within continents, suggesting even finer genetic structure in louse populations (Figures S5-S9).