Introduction
The last 50 years witnessed an explosion in the human population, which
has been supported by a three-fold global expansion in crop production
(FAO’s Statistical Yearbook 2013). Rice, maize, and wheat, together with
some other staple crops, have been key for this expansion.
The rapid increase in crop production has been achieved largely through
higher yields per unit area and crop intensification. Creation of
higher-yielding crop varieties requires specific genes from the gene
pool of the crop species and/or its close relatives, such as the
semidwarfing gene in rice (sd-1) and Rht1 and Rht2 in wheat (Gale & Marshall, 1973; Jennings, 1964). Genetic resources are
fundamental for cultivar improvement; however, most crops have suffered
a loss of genetic diversity following prolonged domestication. For
example, bread wheat, which originated some 8000 years ago in the
Fertile Crescent, has undergone several rounds of genetic erosion (Jia
et al., 2013). Genetic resources of crops and their close relatives were
initially conserved ex situ in seed banks worldwide and later in situ in their homelands or nearby areas. With intense
reclamation of arable land, more and more wild forms of crops and their
close relatives have been lost, increasing our reliance on germplasms
housed in seed banks. However, seeds in seed banks may be mislabeled due
to (1) incorrect species taxonomy, (2) lack of diagnostic morphological
parameters, and (3) contamination with old material. Therefore,
authentication of specimens is crucial to avoid compromising research
and crop production. Given that it is not easy to identify seeds based
solely on morphology, DNA barcoding has come to offer a promising
solution for discriminating between very similar materials.
First proposed in 2003 (Hebert, Cywinska, Ball, & DeWaard, 2003), DNA
barcoding has become a reliable technology to rapidly identify species
based on short DNA fragments. In 2009, the two-locus combination of matK + rbcL was recommended as a core barcode for the
identification of land plants (Hollingsworth et al., 2009). Following
their first mention in 2005 (Kress, Wurdack, Zimmer, Weigt, & Janzen,
2005), the internal transcribed spacer of ribosomal DNA (ITS)/ITS2 and psbA-trnH were proposed as new barcodes for land plants (Chen et
al., 2010; Li et al., 2011; Yan et al., 2015). A region of ycf1 was also proposed as a barcoding target owing to its high resolution
(Dong et al., 2015). Due to unsatisfactory resolution of a single marker
in discriminating between species, various combination schemes were
assessed (Hollingsworth et al., 2009). Nowadays, the technique is
successfully used to discover cryptic species (Huemer, Karsholt, &
Mutanen, 2014; Kress et al., 2009), detect illegally traded, invasive or
endangered species (Lahaye, Van der Bank, Maurin, Duthoit, &
Savolainen, 2008), assess biodiversity (Sonstebo et al., 2010), and
identify medicinal plants in mixtures (Howard et al., 2012). Despite
these and other advancements, conventional DNA barcodes do not work in
the case of extremely closely related species or only slightly diverged
“species” from a recent radiation event (Hollingsworth, Graham, &
Little, 2011). To address such instances, a DNA super barcode was
proposed (Li et al., 2015). A DNA super barcode includes a complete
genome or parts of a genome containing enough information to
discriminate between the species of interest. The entire chloroplast or
mitochondrial genomes, combinations of many genes (or regions in a
genome), and assemblies of single nucleotide polymorphisms constitute
examples of DNA super barcodes. With the advent of super barcodes, seeds
of closely related species in seed banks can finally be assigned to the
correct species or even individual haplotypes. Rice seeds require super
barcodes, such as the entire chloroplast genome, to distinguish between A and C haploid genome types, which are so closely
related that they cannot be resolved using common chloroplast gene
fragments.
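As a toy illustration of why a longer barcode can resolve taxa that a short fragment cannot, the following Python sketch compares a query sequence against two reference sequences, first over a short region and then over the full length. It is only a minimal sketch: the sequences, the taxon labels, and the function names p_distance and best_matches are invented for illustration and are not data or methods from this study. A real analysis would start from properly aligned chloroplast genomes.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def best_matches(query, references, region=slice(None)):
    """Return the reference labels closest to the query within `region`.

    More than one label in the result means the region cannot
    discriminate between those references.
    """
    distances = {
        name: p_distance(query[region], seq[region])
        for name, seq in references.items()
    }
    best = min(distances.values())
    return sorted(name for name, d in distances.items() if d == best)


# Hypothetical aligned sequences: the two references differ at one late site only.
references = {
    "taxon_A": "ATGGCTACCGTTAACGGATCCTTAGGCA",
    "taxon_C": "ATGGCTACCGTTAACGGATCCTTAGGTA",
}
query = "ATGGCTACCGTTAACGGATCCTTAGGCA"

short_fragment = best_matches(query, references, slice(0, 12))
full_length = best_matches(query, references)
print("short fragment:", short_fragment)
print("full length:", full_length)
```

In this invented example the short region is identical in both references, so the query remains unresolved, whereas the full-length comparison includes the single variable site and returns one match.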
Rice belongs to the genus Oryza in the family Poaceae. The genus
consists of about 26 species distributed across tropical and subtropical
areas (Vaughan, 1989; Table S1). However, disputes remain regarding the
relationship between O. granulata and O. meyeriana, and
between O. schweinfurthiana and O. punctata. Oryza has a very short evolutionary history. It diverged from Leersia some 14 million years ago (Guo & Ge, 2005) and includes nine known
haploid genome types (A, B, C, E, F, G, J, K, and L) and two unknown
genome types (D and H) (Aggarwal, Brar, Nandi, Huang,
& Khush, 1999). The genus has been subjected to several taxonomic
revisions but some issues persist (Liu, Yan, & Ge, 2016; Lu, Ge, Sang,
Chen, & Hong, 2001; Rougerie et al., 2014; Vaughan, 1989). For example,
the two subspecies of Asian rice (O. sativa), subsp. indica and subsp. japonica, are taxonomically incorrect.
Akin to African rice (O. glaberrima), they are intermingled
morphologically and perhaps genetically with their wild progenitors or
relatives.
Cultivated rice is one of the most important cereal crops worldwide and
it feeds more than half of the world’s population (Khush, 2005). Its
wild progenitors or relatives represent precious genetic resources for
rice breeding and genetic improvement (Vaughan, Morishima, & Kadowaki,
2003; Wing et al., 2005). Established genomic tools for the molecular and
genetic study of O. sativa (Kim et al., 2008; Tang et al., 2010)
can facilitate the correct characterization of seeds and the use of
genetic resources housed in seed banks. Here, we demonstrate the
effectiveness of a rice chloroplast genome super barcode for identifying rice seeds from
seed banks. By employing nuclear DNA barcodes, we also address
possible shortcomings of using the rice chloroplast genome super barcode.