Accounting for reciprocal local ancestry deviations

One concern with our analysis is that deviations in local ancestry might result from reciprocal shared ancestry. That is, a non-local donor individual might share haplotypes with a recipient population because they themselves are admixed from the recipient group. Within our framework, such reciprocal copying appears when a single donor individual, or a small number of them, contributes excessively to an ancestry deviation signal. An example can be seen in Figure \ref{fig:fig3}. Across all Gambian populations from West Africa (of which we show the Wollof as an example) we observed a significant increase in East African ancestry across the 32.2-33Mb region of chromosome 6, which contains several HLA genes [Fig. \ref{fig:fig3}a and \ref{fig:fig3}b]. However, this signal is driven almost entirely by Wollof individuals copying from a single Nilo-Saharan speaking individual, Anuak11. When we painted this individual with both non-local and local donors (i.e. including individuals from the Nilo-Saharan ancestry region), we observed that, across this region, one haplotype from this individual copies a roughly 1Mb chunk of their genome from the West African ancestry region that includes Gambian populations [dark blue in Fig. \ref{fig:fig3}c], even though donors from the more closely related Nilo-Saharan ancestry region were available. Our analysis of admixture in the Anuak showed an admixture event involving Central West African and Nilo-Saharan sources in 703CE (95% CI 427CE-1037CE) [Fig. \ref{fig:admOverview}], which may have provided a vehicle for West African ancestry to enter this population.
One way to quantify this effect is to look at the number of unique donor haplotypes from an ancestry region contributing to a signal as a proportion of the total number of donors. This quantity will be 1 if every recipient individual copies from a different donor haplotype and will tend towards zero as the number of unique donors copied from goes down. Whilst for Nilo-Saharan ancestry in the Wollof the median proportion of unique donors contributing to a signal genome-wide is 0.342 (95% CI 0.20-0.55) [Fig. \ref{fig:fig3}e], the average across the 32.2-33Mb region of chromosome 6 is 0.072, which lies significantly outside of the genome-wide distribution (empirical outlying P value = 0.0007). We can see that this signal is driven by a low proportion of unique donors from the Nilo-Saharan ancestry region [purple line in Fig. \ref{fig:fig3}f] and a large increase in copying from the most copied haplotype [Fig. \ref{fig:fig3}g]. Because this effect may result from the donor individual being admixed from the recipient population, it is not possible to infer the directionality of ancestry sharing in such cases. Therefore, to guard against reciprocal copying, we applied an additional filter to the results. At each locus in the genome, we computed the number of unique donor haplotypes that contributed to the total amount of copying from a given ancestry region. If the number of unique donors was \(\leq\) 3 or the proportion of unique donors was \(<\) 0.1, we assumed the result was due to reciprocal copying.
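
The unique-donor filter described above can be illustrated with a minimal sketch. This is not the pipeline used in the analysis: the function names and the input layout (one donor haplotype label per recipient haplotype at a single locus, with `None` where the recipient copies from a different ancestry region) are hypothetical. It also assumes that the denominator of the proportion is the number of recipient haplotypes copying from the ancestry region at that locus, which is one reading of the definition that yields 1 when every recipient copies from a different donor haplotype. Only the two thresholds (3 and 0.1) are taken from the text.

```python
from typing import Optional, Sequence, Tuple


def unique_donor_stats(copied: Sequence[Optional[str]]) -> Tuple[int, float]:
    """Return (number of unique donors, proportion of unique donors) at one locus.

    `copied` holds, for each recipient haplotype, the label of the donor
    haplotype it copies from within the ancestry region of interest, or
    None if it copies from elsewhere. (Hypothetical input layout.)
    """
    donors = [d for d in copied if d is not None]
    n_unique = len(set(donors))
    # Assumed denominator: number of recipient haplotypes copying from the region.
    proportion = n_unique / len(donors) if donors else 0.0
    return n_unique, proportion


def is_reciprocal(copied: Sequence[Optional[str]],
                  max_unique: int = 3,
                  min_proportion: float = 0.1) -> bool:
    """Flag a locus when unique donors <= 3 or their proportion is < 0.1.

    Note that a locus with no copying from the region is also flagged,
    because zero unique donors satisfies the first condition.
    """
    n_unique, proportion = unique_donor_stats(copied)
    return n_unique <= max_unique or proportion < min_proportion


# Toy data with made-up labels: 28 recipients copy one donor haplotype, 2 another.
concentrated = ["donorA_h1"] * 28 + ["donorB_h2"] * 2
# Toy data: 20 recipients each copy a different donor haplotype.
dispersed = [f"donor{i}_h1" for i in range(20)]
```

On these toy inputs, `is_reciprocal(concentrated)` flags the locus, because only two unique donors contribute, whereas `is_reciprocal(dispersed)` does not, because the proportion of unique donors is 1.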