
High-throughput sequencing of 5S-IGS in oaks - exploring intragenomic variation and algorithms to recognize target species in pure and mixed samples.
  • Roberta Piredda,
  • Guido Grimm,
  • Ernst-Detlef Schulze,
  • Thomas Denk,
  • Marco Simeone
Roberta Piredda
Stazione Zoologica Anton Dohrn
Guido Grimm
University of Vienna
Ernst-Detlef Schulze
Max Planck Institute for Biogeochemistry
Thomas Denk
Swedish Museum of Natural History
Marco Simeone
Università della Tuscia

Peer review status: UNDER REVIEW

24 Mar 2020: Submitted to Molecular Ecology Resources
14 Apr 2020: Assigned to Editor
14 Apr 2020: Submission Checks Completed
18 May 2020: Reviewer(s) Assigned

Abstract

Measuring biological diversity is a crucial but difficult undertaking, as exemplified in oaks, where complex morphological, ecological, biogeographic and genetic differentiation patterns collide with a traditional taxonomy that measures biodiversity as the number of species (or higher taxa). In this pilot study, we generated High-Throughput Sequencing (HTS) amplicon data of the intergenic spacer of the 5S nuclear ribosomal DNA cistron (5S-IGS) in oaks, using six mock samples that differ in geographic origin, species composition, and pool complexity. The potential of the marker for automated geno-taxonomy applications was assessed using a reference dataset of 1770 cloned 5S-IGS sequences, covering the entire taxonomic breadth and distribution range of western Eurasian Quercus, and applying similarity-based (BLAST) and evolutionary approaches (ML trees and EPA). Both methods performed equally well, correctly identifying species of sections Ilex and Cerris in the pure and mixed samples, as well as the main genotypes shared by species of sect. Quercus. Application of different cut-off thresholds revealed that medium- to high-abundance sequences (abundance >10 or >25) suffice for clear-cut species identification in samples containing one or a few individuals. Lower thresholds identify phylogenetic correspondence with all target species in highly mixed samples (analogous to environmental bulk samples) and include rare variants pointing towards reticulation, incomplete lineage sorting, pseudogenic 5S units, and in-situ (natural) contamination. Our pipeline is highly promising for future assessments of intra-specific and inter-population diversity, and of the genetic resources of natural ecosystems, which are fundamental to empowering fast and solid biodiversity conservation programs worldwide.
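
The abundance cut-offs mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `filter_by_abundance`, the toy reads, and the choice to count identical reads as one sequence variant are assumptions made only for illustration. The sketch dereplicates reads, counts how often each unique sequence occurs, and keeps those whose count exceeds a chosen threshold.

```python
from collections import Counter


def filter_by_abundance(reads, min_abundance):
    """Dereplicate reads and keep unique sequences seen more than min_abundance times.

    Hypothetical helper for illustration; not taken from the study's pipeline.
    """
    counts = Counter(reads)  # unique sequence -> number of identical reads
    return {seq: n for seq, n in counts.items() if n > min_abundance}


# Toy reads (invented, not data from the study)
reads = ["ACGT"] * 30 + ["ACGA"] * 12 + ["TCGA"] * 3

strict = filter_by_abundance(reads, 25)   # high-abundance variants only
relaxed = filter_by_abundance(reads, 10)  # also keeps medium-abundance variants
everything = filter_by_abundance(reads, 0)  # retains rare variants as well
```

In this toy case the stricter threshold retains only the dominant variant, while lowering the threshold progressively admits rarer variants, which mirrors the trade-off the abstract describes between clear-cut identification and sensitivity to rare sequences.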