
DNA Barcoding and geographical scale effect: the problems of undersampling genetic diversity hotspots
  • Álvaro Gaytán,
  • Johannes Bergsten,
  • Tara Canelo,
  • Carlos Pérez-Izquierdo,
  • Maria Santoro,
  • Raul Bonal
Álvaro Gaytán
Stockholms Universitet
Johannes Bergsten
Naturhistoriska riksmuseet
Tara Canelo
Universidad de Extremadura - Campus de Plasencia
Carlos Pérez-Izquierdo
Universidad de Extremadura - Campus de Plasencia
Maria Santoro
Instituto de Investigacion en Recursos Cinegeticos
Raul Bonal
Universidad de Extremadura - Campus de Plasencia

Peer review status: UNDER REVIEW

14 Mar 2020: Submitted to Molecular Ecology Resources
31 Mar 2020: Reviewer(s) Assigned
08 Jul 2020: Review(s) Completed, Editorial Evaluation Pending


DNA barcoding identification requires a good characterization of intra-specific genetic divergence to establish the limits between species. Yet the number of barcodes per species is often low and geographically restricted. Poor coverage of a species' distribution range may hamper identification, especially when undersampled areas host genetically distinct lineages. In that case, the genetic distance between some query sequences and the reference barcodes may exceed the maximum intra-specific threshold for unequivocal species assignment. Taking a group of Quercus herbivores (moths) in Europe as a model system, we found that the number of DNA barcodes from southern Europe is proportionally very low in the Barcode of Life Data Systems (BOLD). This geographical bias complicates the identification of southern query sequences because of their high intra-specific genetic distance from barcodes collected at higher latitudes. Pairwise intra-specific genetic divergence increased with spatial distance, but was higher when at least one of the sampling sites was in southern Europe. Accordingly, the GMYC (Generalized Mixed Yule Coalescent) single-threshold model retrieved clusters composed exclusively of Iberian haplotypes, some of which could correspond to cryptic species. The number of putative species retrieved was more reliable than that of the multiple-threshold GMYC and very similar to the results from ABGD and jMOTU. Our results support GMYC as a key resource for species delimitation within poorly inventoried biogeographic regions in Europe, where historical factors (e.g. glaciations) have promoted genetic diversity and singularity. Future European DNA barcoding initiatives should preferentially be carried out along latitudinal gradients, with a special focus on the southern peninsulas.
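To illustrate how an undersampled lineage can fall outside the identification threshold, the following minimal Python sketch computes uncorrected pairwise distances (p-distances) between a query and a set of reference barcodes and assigns a species only when the nearest reference lies within a maximum intra-specific distance. The sequences, the reference names and the 3% threshold are hypothetical values chosen for illustration; they are not taken from the study, which may have used a different distance measure and threshold.

```python
VALID_BASES = set("ACGT")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites with gaps or ambiguity codes in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def assign_species(query: str, references: dict, threshold: float = 0.03):
    """Return (species or None, distance to the nearest reference).

    The default threshold of 0.03 is an illustrative assumption only.
    """
    nearest_name, nearest_dist = min(
        ((name, p_distance(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )
    if nearest_dist <= threshold:
        return nearest_name, nearest_dist
    return None, nearest_dist


# Hypothetical toy alignment: references sampled only at higher latitudes.
references = {
    "species_A_north": "ACGTACGTACGTACGTACGT",
    "species_B_north": "TTGTACGAACGTTCGTACCA",
}
# Hypothetical southern query, one substitution away from species_A_north.
southern_query = "ACGTACGTACGTACGTACGA"

species, distance = assign_species(southern_query, references)
print(species, distance)
```

In this toy case the southern query is closest to the northern barcode of species A, yet its distance exceeds the assumed threshold, so no unequivocal assignment is returned. Adding a southern reference barcode for the same species to the dictionary would bring the nearest distance back under the threshold.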