Cesar Fortes-Lima and 4 more authors

Admixture is a fundamental evolutionary process that has influenced genetic patterns in numerous species. Maximum-likelihood approaches based on allele frequencies and linkage disequilibrium have been used extensively to infer admixture processes from genome-wide datasets, mostly in human populations. Nevertheless, complex admixture histories, beyond one or two pulses of admixture, remain methodologically challenging to reconstruct. We develop an Approximate Bayesian Computation (ABC) framework to reconstruct highly complex admixture histories from independent genetic markers. We build the software package MetHis to simulate independent SNPs or microsatellites in a two-way admixed population under scenarios with multiple admixture pulses, monotonically decreasing or increasing recurring admixture, or combinations of these scenarios, with model-parameter values drawn from user-specified prior distributions. For each simulation, MetHis calculates 24 summary statistics describing genetic diversity and moments of individual admixture fractions. We couple MetHis with existing machine-learning ABC algorithms to investigate the admixture history of hybrid populations. Results show that Random-Forest ABC scenario choice accurately distinguishes most complex admixture scenarios; errors occur mainly in regions of the parameter space where scenarios are highly nested and thus biologically similar. We focus on African American and Barbadian populations as case studies. We find that Neural-Network ABC posterior parameter estimation is accurate and reasonably conservative under complex admixture scenarios. For both admixed populations, we find that monotonically decreasing contributions over time, from Europe and Africa, explain the observed data more accurately than multiple admixture pulses. This approach will allow detailed admixture histories to be reconstructed when maximum-likelihood methods are intractable.
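To make the contrast between pulse-like and monotonically decreasing recurring admixture concrete, the following toy sketch tracks the expected ancestry fraction from one source population across generations. It is an illustration under stated assumptions and not the MetHis implementation: the function names, the per-generation contribution parameters `s1` and `s2`, and the geometric decay used for the decreasing scenario are all hypothetical choices made here. The sketch assumes that, at each generation, the admixed population receives fractions `s1` and `s2` of its parents from the two sources and the remainder `1 - s1 - s2` from itself.

```python
from typing import Dict, List, Tuple

# Per-generation contributions (s1, s2) from source 1 and source 2.
# Index 0 is the founding event, where s1 + s2 must equal 1.
Schedule = List[Tuple[float, float]]


def pulse_schedule(n_gen: int, founding_s1: float,
                   pulses: Dict[int, Tuple[float, float]]) -> Schedule:
    """Founding event followed by isolated pulses (hypothetical helper).

    `pulses` maps a generation index (1..n_gen) to its (s1, s2) contributions;
    generations not listed receive no gene flow from either source.
    """
    schedule = [(founding_s1, 1.0 - founding_s1)]
    for g in range(1, n_gen + 1):
        schedule.append(pulses.get(g, (0.0, 0.0)))
    return schedule


def decreasing_schedule(n_gen: int, founding_s1: float, start_s1: float,
                        start_s2: float, decay: float) -> Schedule:
    """Founding event followed by recurring admixture that shrinks each generation.

    Geometric decay is an assumption chosen only for illustration.
    """
    schedule = [(founding_s1, 1.0 - founding_s1)]
    for g in range(1, n_gen + 1):
        factor = decay ** (g - 1)
        schedule.append((start_s1 * factor, start_s2 * factor))
    return schedule


def expected_fraction(schedule: Schedule) -> float:
    """Expected source-1 ancestry fraction after the last generation."""
    s1, s2 = schedule[0]
    if abs(s1 + s2 - 1.0) > 1e-12:
        raise ValueError("founding contributions must sum to 1")
    frac = s1
    for s1, s2 in schedule[1:]:
        h = 1.0 - s1 - s2  # share of parents drawn from the admixed population
        if s1 < 0.0 or s2 < 0.0 or h < 0.0:
            raise ValueError("contributions must be non-negative and sum to at most 1")
        # New migrants from source 1 carry fraction 1, those from source 2 carry 0,
        # and parents from the admixed population carry the previous fraction.
        frac = s1 + h * frac
    return frac


if __name__ == "__main__":
    pulses = pulse_schedule(20, 0.3, {5: (0.1, 0.0), 12: (0.0, 0.2)})
    recurring = decreasing_schedule(20, 0.3, 0.05, 0.10, 0.8)
    print("pulses:   ", expected_fraction(pulses))
    print("recurring:", expected_fraction(recurring))
```

Different schedules can yield similar expected fractions, which is one reason higher moments of individual admixture fractions are informative as summary statistics for telling scenarios apart. This sketch tracks only the mean and does not simulate genotypes.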