Discrimination powers of conventional, rice-specific, and super
DNA barcodes
The different genome types within the Oryza genus have generally
diverged sufficiently for most molecular markers to discriminate between
them. The resolution of the various markers is tested by the presence of
more than one species per genome type. Phylogenetic methods are the most
reliable way to assign a sample to a species and the following
comparisons were based on the maximum parsimony phylogenies of nearly
identical samples using different molecular markers, such as matK, rbcL, psbA-trnH, ITS, NP78+R22, rice-specific
barcodes, and the super barcode. Because of narrowly or incorrectly
delimited species, molecular markers cannot discriminate between the
following species pairs: O. alta and O. grandiglumis (Bao & Ge, 2004);
O. barthii and O. glaberrima (Wang et al., 2014); O. glumipatula and
O. longistaminata; O. granulata and O. meyeriana (Gong, Borromeo, & Lu,
2000); O. minuta and O. malampuzhaensis; O. nivara and O. sativa subsp.
indica; and O. sativa subsp. japonica and O. rufipogon.
The matK gene had an aligned length of 1417 sites with 90
parsimony-informative characters when outgroups were included. This
marker failed to discriminate between species of the A, B, and C genomes (Fig. S1).
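The parsimony-informative character counts reported for each marker can be derived directly from an alignment. The following is a minimal sketch, assuming the standard definition (a site is informative when at least two character states each occur in at least two sequences) and assuming that gaps and ambiguity codes are ignored. The function name and the toy alignment are hypothetical and are not taken from this study.

```python
from collections import Counter


def count_parsimony_informative(alignment, ignore="-?N"):
    """Count alignment columns with >= 2 states that each occur in >= 2 sequences.

    `alignment` is a list of equal-length strings (one per sample).
    Characters listed in `ignore` (gaps, missing data) are skipped;
    this handling is an assumption, not the study's stated procedure.
    """
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    informative = 0
    for column in zip(*alignment):
        # Tally the character states observed in this column.
        counts = Counter(
            base.upper() for base in column if base.upper() not in ignore
        )
        # States seen in at least two sequences can support a grouping.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative


# Hypothetical toy alignment: four samples, five aligned sites.
toy_alignment = [
    "ACGTA",
    "ACGTC",
    "ATGAC",
    "ATGAA",
]
toy_count = count_parsimony_informative(toy_alignment)
```

In the toy alignment, sites 2, 4, and 5 each carry two states shared by two samples, whereas sites 1 and 3 are constant. A site where the variant occurs in only one sample (a singleton) is variable but not parsimony-informative.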
The rbcL gene had an aligned length of 1428 sites with 50
parsimony-informative characters when outgroups were considered. This
marker also failed to discriminate between species of the A, B, and C genomes (Fig. S2).
The psbA-trnH region had an aligned length of 515 sites with 10
parsimony-informative characters when outgroups and partial rps19 were included. This marker could successfully identify only O.
brachyantha and O. sativa subsp. indica (Fig. S3).
The nuclear ITS (including 5.8S) had an aligned length of 713 sites with
162 parsimony-informative characters when outgroups were considered. The
samples used for this marker differed slightly from those subjected to
chloroplast markers because the sequences were difficult to amplify.
Only one ITS copy was detected in several allotetraploid species.
Phylogeny data based on ITS suggested that the H or J genome types originated from the F genome type (Fig. S4), a
finding not supported by the other two nuclear genes. The ITS failed to
discriminate between species of the A and C genome
types.
The nuclear NP78+R22 gene combination had an aligned length of 2218
sites with 722 parsimony-informative characters when outgroups were
included. This marker combination failed to discriminate between species
of the A, B, C, H, and J genome types (Fig. S5).
The rice-specific barcode consisted of six hypervariable chloroplast
regions and had an aligned length of 7943 sites with 603
parsimony-informative characters when outgroups were considered. This
marker combination resolved almost all species except O. punctata and O. minuta of the B genome type (Fig. S6).
Finally, the super DNA barcode of the complete chloroplast genome had an
aligned length of 145,860 sites with 5048 parsimony-informative
characters when outgroups were included. The super barcode exhibited the
highest discriminating power, resolving all species using an insensitive
but extremely reliable phylogenetic method (Fig. 1). Even though species
of genome types A and C are very closely related and
difficult to identify, the super barcode resolved them sufficiently
well. Surprisingly, the species O. rufipogon + O. sativa subsp. japonica and O. nivara + O. sativa subsp. indica were separable using the super barcode.
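The aligned lengths and parsimony-informative character counts reported above can be compared on a common scale by computing the fraction of informative sites per marker. The sketch below uses only the figures given in this section; the variable and function names are hypothetical. The fraction is offered as a descriptive summary, on the assumption that it is useful alongside the absolute counts, and not as a measure used in the study.

```python
# (aligned length, parsimony-informative characters) as reported in the text.
markers = {
    "matK": (1417, 90),
    "rbcL": (1428, 50),
    "psbA-trnH": (515, 10),
    "ITS": (713, 162),
    "NP78+R22": (2218, 722),
    "rice-specific barcode": (7943, 603),
    "super barcode": (145860, 5048),
}


def informative_fraction(sites, informative):
    """Return the share of aligned sites that are parsimony-informative."""
    return informative / sites


# Rank markers by density of informative sites, highest first.
ranked = sorted(
    markers.items(),
    key=lambda item: informative_fraction(*item[1]),
    reverse=True,
)

for name, (sites, informative) in ranked:
    fraction = informative_fraction(sites, informative)
    print(f"{name:>22}: {informative:>5} / {sites:<7} = {fraction:.2%}")
```

On these figures the nuclear markers have the highest density of informative sites, while the super barcode has a low density but by far the largest absolute number of informative characters (5048), which is consistent with it being the only marker reported here to resolve all species.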