Key words: animals, cpDNA, genomics, mtDNA, plants, taxonomy
Next-generation sequencing (NGS) and genomics continue to transform how
biologists address fundamental questions in ecology and evolution. The
quantity of data that can be generated quickly and cheaply enables
researchers to interrogate genomes for hundreds or thousands of loci
that can be used for evolutionary inference. This phenomenon has led to
an important paradigm shift in how phylogenetic, phylogeographic, and
population genomic studies are designed. Historically, mitochondrial DNA
(mtDNA) was the primary molecular marker used to estimate evolutionary
history and demographic parameters in animals (Avise et al., 1987;
Avise, 2000), whereas
chloroplast markers were used extensively in plant phylogenetics and
phylogeography (Bonatelli, Zappi, Taylor, & Moraes, 2013; Hickerson et
al., 2010; Soltis, Gitzendanner, Strenge, & Soltis, 1997).
Subsequently, microsatellites became a popular multilocus,
fragment-based method used to investigate population structure in the
nuclear genome. More recently, methods and markers such as RADseq and
its derivatives (Baird et al., 2008; Elshire et al., 2011; Peterson et
al., 2012), ultraconserved elements (UCEs; Faircloth et al., 2012),
anchored hybrid enrichment (AHEs; Lemmon, 2012), transcriptomes, and
whole genomes have emerged to take full advantage of the power of NGS
technologies. The potential resolution provided by thousands of loci is
impressive, but not without challenges. Issues with assembly, paralogy,
variant calling, phasing, and sequencing errors can all impact
subsequent evolutionary inference. Another major issue that can affect
these genomic studies, and one that has not been properly addressed, is the
accurate taxonomic identification of samples. I argue that sequencing
protocols targeting mtDNA genes (animals), chloroplast DNA (cpDNA;
plants) and select nuclear genes (e.g. ITS; fungi, plants) should remain
powerful resources in the molecular ecologist’s toolbox, even though
the limitations of these markers are well established, and there appears
to be a downward trend in using these data in modern genomic studies
(Fig. 1).
The pros and cons of mtDNA and cpDNA have been discussed in numerous
publications (Barrowclough & Zink, 2009; Edwards, 2009; Yan et al.,
2018; Zink & Barrowclough, 2008). The smaller effective population
size of the mtDNA genome, when coupled with a relatively high
substitution rate, is often beneficial when investigating evolutionary
processes on more recent timescales. Recently diverged groups are
expected to reach reciprocal monophyly faster with mitochondrial genes
than with nuclear genes. This phenomenon alone can provide novel insight
into mechanisms of reproductive isolation and speciation, particularly
when combined with powerful analytical tools for species delimitation
(Blair & Bryson, 2017; Fujisawa & Barraclough, 2013; Kapli et al.,
2017; Talavera et al., 2013). Two classic North American examples that
highlight the utility of mtDNA for barrier detection and lineage
divergence include the multiple genetic breaks across the Peninsula of
Baja California (Blair, 2009; Lindell et al., 2005, 2006, 2008; Upton &
Murphy, 1997) and differentiation across the Continental Divide/Cochise
Filter Barrier (Castoe et al., 2007; Myers et al., 2017). Many of these
patterns have been subsequently corroborated with genomic data
(Harrington et al., 2018; Schield et al., 2015), highlighting the
benefit of mtDNA for generating primary evolutionary hypotheses. Several
taxonomic groups also show evidence of mtDNA or cpDNA introgression
(Çoraman et al., 2020; Leaché & Cole, 2007; Leaché & McGuire, 2006;
Mastrantonio et al., 2016; Vitelli et al., 2016; Yan et al.,
2018)—evolutionary patterns that would be invisible with only nuclear
DNA (nDNA). Another major benefit of sequencing mtDNA, cpDNA, and select
nuclear genes (e.g. ITS) is the vast quantity of homologous data
available in public databases, generated in part through the DNA barcoding initiative
(Hebert, 2004; Hebert, Cywinska, & Ball, 2003; Hebert & Gregory,
2005). Although DNA barcoding and single locus studies have been
discussed and criticized repeatedly in the literature (Hickerson, Meyer,
& Moritz, 2006; Moritz & Cicero, 2004; Will, Mishler, & Wheeler,
2005), there is a continuing utility in sequencing rapidly evolving
and/or highly discriminatory loci, at the least to help generate new
evolutionary and taxonomic hypotheses (DeSalle & Goldstein, 2019;
Honeycutt, 2021). Finally, it is relatively easy to amplify and
sequence mtDNA and cpDNA in a standard molecular laboratory due to high
copy numbers and the availability of primers, making the work feasible
for faculty even at moderately funded institutions.
Although the benefits of mtDNA and cpDNA are well-known and generally
appreciated, a major limitation of these markers is that they are
composed of linked genes that seldom undergo recombination. From an
analytical perspective, this essentially means that the entire plastid
or mitochondrial genome should be treated as a single locus, which in turn
provides a very narrow view of the evolutionary history of the group.
In short, an mtDNA or cpDNA gene tree may or may not be congruent with
the species tree due to processes such as the stochasticity of the
coalescent process (i.e. incomplete lineage sorting; ILS),
introgression, horizontal gene transfer, and gene duplication or loss.
Thus, in many cases organellar sequences will not be a reliable
indicator of taxonomy, and some nuclear loci will be required. The
strength of NGS and methods like ddRADseq and UCEs is that they provide
thousands of essentially independent coalescent histories with which
researchers can estimate important evolutionary parameters such as
species limits, species trees, divergence times, patterns and rates of
gene flow, selection, and population size changes (Andrews et al., 2016;
Baca et al., 2017; Blair et al., 2019; Bryson et al., 2017; Davey &
Blaxter, 2010; Finger et al., 2022).
Because of the perpetual decrease in DNA sequencing costs, along with
the realization of the limitations of single locus studies, more
contemporary publications are relying solely on genomic data for
evolutionary inference (Crawford et al., 2012; Dam et al., 2017; D. A.
Eaton & Ree, 2013; Hou et al., 2015; Villaverde et al., 2021; Wagner et
al., 2013), although exceptions do exist (Finger et al., 2022; Kato et
al., 2020; Zarza et al., 2018). I argue that relying solely on genomic data is not an optimal
use of resources, given the potential benefits of organellar markers
outlined above. Another benefit that has not been properly discussed
in the context of genomics is how these markers can be used to help verify
taxonomic identities prior to downstream analysis. This is particularly
relevant given the widespread nature of cryptic species in many
taxonomic groups. When working with mtDNA and cpDNA, an optimal approach
following quality trimming and assembly is to BLAST each sequence to
verify both the correct gene and taxonomic identity. This is made
possible by the decades of sequence data that have been deposited in
databases such as GenBank and the Barcode of Life Data System (BOLD;
http://v4.boldsystems.org/; Ratnasingham & Hebert, 2007). Note that
this approach is currently not possible with reduced representation
genomic data from non-model species, for which comparable reference sequences are rarely available. Instead, researchers must assume
that they sequenced what they thought they were sequencing. Anomalous
samples can possibly be identified in subsequent analyses (e.g.
phylogenetic and/or clustering analysis), but this approach will not
reveal the true identity of the individuals. Moreover, an anomalous,
highly divergent sequence might erroneously suggest an unknown, cryptic
species. This would lead to the researcher spending additional time and
resources trying to decipher the true history and taxonomy of the
sample. In a worst-case scenario, one might describe the taxon as a new
species simply based on results of the molecular analysis. It should be
noted, however, that databases like GenBank can and do contain errors
(Ashelford et al., 2005; Bridge et al., 2003; Meiklejohn et al., 2019;
Mulcahy et al., 2022), and processes such as mtDNA introgression and ILS
could lead to incorrect species identities. In addition, taxonomic
biases exist in these databases. As of January 2023, BOLD contained
sequences for 246k species of animals, 72k species of plants, and only
24k species of fungi and “other species” such as protists.
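The verification step described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the input is a hypothetical four-column, tab-separated table of search hits (sample, hit species, percent identity, bit score), and the function name `flag_mismatches`, the 97% identity cutoff, and the example records are invented here rather than taken from any study or tool.

```python
import csv
import io

def flag_mismatches(blast_tsv, expected, min_identity=97.0):
    """Flag samples whose best hit disagrees with the expected species label.

    blast_tsv: tab-separated text with ASSUMED columns
               sample, hit_species, percent_identity, bit_score.
    expected:  dict mapping sample -> species name on the tissue label.
    Returns a dict sample -> (expected, best_hit_species, percent_identity).
    """
    best = {}  # sample -> (species, percent identity, bit score) of top hit
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        sample, species = row[0], row[1]
        pident, bitscore = float(row[2]), float(row[3])
        # Keep the hit with the highest bit score for each sample.
        if sample not in best or bitscore > best[sample][2]:
            best[sample] = (species, pident, bitscore)

    flagged = {}
    for sample, label in expected.items():
        if sample not in best:
            flagged[sample] = (label, None, None)  # no hit recovered at all
            continue
        species, pident, _ = best[sample]
        # The cutoff is an arbitrary illustrative value, not a recommendation.
        if species != label or pident < min_identity:
            flagged[sample] = (label, species, pident)
    return flagged

# Invented example records (not real search results).
hits = (
    "S1\tCrotalus atrox\t99.5\t1100\n"
    "S1\tCrotalus ruber\t93.0\t900\n"
    "S2\tCrotalus ruber\t99.1\t1080\n"
)
labels = {"S1": "Crotalus atrox", "S2": "Crotalus atrox", "S3": "Crotalus atrox"}
print(flag_mismatches(hits, labels))
```

Flagged samples are candidates for closer inspection rather than automatic relabelling, because database errors, introgression, and ILS can all produce a misleading best hit.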
Obtaining samples for phylogeographic and population genomic studies can
be challenging due to funding limitations, logistics, geography, lack of
personnel, and time constraints, particularly for researchers at
underfunded and primarily undergraduate institutions. Thus, it is common
for researchers to reach out to natural history museums to inquire about
tissue loans to help facilitate investigations. These museums represent
a vitally important record of life on Earth that has enabled ecological
and evolutionary research for centuries. Unfortunately, taxonomic
identifications made by collectors in the field are not always accurate,
and museum databases may contain errors and/or outdated information
(Boessenkool et al., 2010; Mulcahy et al., 2022; Scott et al., 2019).
This is the case even with relatively well-known groups distributed
throughout North America. Given that most loans for molecular studies
are tissues only, researchers have little information available to
verify taxonomic identity. By sequencing and analyzing common mtDNA,
cpDNA and nuclear genes in addition to genomic data, researchers can
verify the correct species ID of each sample through a simple BLAST or
BOLD search (https://ibol.org/, http://v4.boldsystems.org/).
Note that the sequenced genes do not have to be limited to the barcoding
markers, as sequences for many other genes are widely available across
the Tree of Life (e.g. the mitochondrial ND4 gene has been sequenced in
squamate reptiles for decades; Arevalo et al., 1994).
Although amplifying and Sanger sequencing an additional gene (or pair of
genes for plants if following barcoding recommendations) adds to the
overall project cost and time commitment, I argue that the cost is well
worth the potential gain. Depending on the available infrastructure,
amplifying and sequencing a single mtDNA gene for a plate (96 samples)
should add ~$1000–2000 to the overall cost of the
project. Even though the cost of Sanger sequencing is relatively high on
a per-base basis, the amount of additional information it can provide is
vast, and can reassure researchers that they are working with the
species of interest. These data can also be used to investigate deep
coalescence, introgression, sex-biased dispersal, patterns of selection,
and putative species limits. I will note that in many cases
(particularly in animals), it will not be necessary to sequence multiple
genes, as the additional benefit likely declines substantially after the
first locus. This is due, in part, to the linked nature of organellar
genomes as discussed above. Additional genes can potentially provide
more phylogenetically informative characters, but each gene will
inherently reflect the same history. Finally, the relatively simple
process of amplifying and sequencing a single informative gene can help
fix errors in museum databases through a quick correspondence with
curators (Mulcahy et al., 2022).
Although Sanger sequencing a single organellar locus for up to a plate
may be justified in some instances, the total project cost can quickly
become prohibitive if additional loci and individuals are needed. Based
on current price estimates (January 2023), the total cost of amplifying
and sequencing a second cytoplasmic or nuclear locus using Sanger
technology (96 samples, excluding personnel costs) would likely meet or
surpass the cost of a GBS/ddRADseq protocol offered by commercial labs
(e.g. https://biotech.wisc.edu/). Thus, there has been recent interest
in mining genomic data sets for organellar loci at no additional
sequencing cost. Organellar markers are often sequenced in genomics
studies as “by-catch” with certain library preparation techniques
(Allio et al., 2020). Whole genome sequencing will inherently capture
the organellar genome in addition to nuclear sequences, and researchers
should make full use of these data by assembling and annotating these
genomes. Perhaps more surprisingly, targeted sequence capture methods
like UCEs and AHEs also tend to return organellar sequences, which
should be fully utilized in subsequent evolutionary analyses (Amaral et
al., 2015; Caparroz et al., 2018; Derkarabetian et al., 2019; Lemmon,
2012; Lyra et al., 2017; Miller et al., 2022; Simon et al., 2019; Zarza
et al., 2018). Finally, several RADseq/GBS studies have isolated and
analyzed organellar markers to provide a more comprehensive perspective
on evolutionary history (Du et al., 2020; Meger et al., 2019; Stobie et
al., 2019). Some of the traditionally used bioinformatics packages such
as Stacks 2 (Rochette et al., 2019) and ipyrad (D. A. R. Eaton &
Overcast, 2020) can be used to isolate mtDNA and cpDNA sequences from
raw reads, while newer tools are beginning to emerge that are designed
specifically for this task (Laczkó et al., 2022). However, obtaining
organellar loci from RADseq libraries poses numerous challenges. For
example, the number and types of loci recovered will be inherently based
on characteristics of the organellar genomes under study and the
restriction enzymes used (Laczkó et al., 2022). Second, allelic dropout
due to mutations in restriction enzyme cut sites can result in a highly
fragmented data set and possibly lead to biased diversity estimates
(Gautier et al., 2013). This will be particularly problematic when
working with animal mtDNA genomes given the fast rate of molecular
evolution in many groups, resulting in some samples with little data to
help verify identity. Third, the loci recovered may have limited
taxonomic utility due to sequence conservation and/or the absence of
homologous sequences in public repositories. Thus, in theory there may
be little need to use Sanger-based technology to obtain organellar
sequences in modern genomic studies, although more work is needed that
focuses on optimal library preparation techniques and bioinformatics
protocols. Note that there are currently several tools available to help
researchers plan and design RADseq studies that can help target
organellar markers (Chafin et al., 2018; Melo & Hale, 2019;
Rivera-Colón et al., 2021). Another alternative to minimize cost may be
to Sanger sequence a small proportion of anomalous samples identified
through phylogenetic and population genetic analyses. An obvious
weakness of this approach is that organellar data would not be available
for all samples, limiting the additional utility provided by these
markers.
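The dependence of recovered organellar loci on the restriction enzymes used can be explored with a simple in silico double digest. The following Python sketch is a minimal illustration, not a replacement for the planning tools cited above: it assumes complete digestion, treats the sequence as linear (organellar genomes are typically circular), and uses the EcoRI (G^AATTC) and MspI (C^CGG) recognition sites as an example enzyme pair. The function names, the toy sequence, and the size window are hypothetical.

```python
import re

# Recognition sequence and cut offset (distance from the start of the site
# to the top-strand cut) for an example ddRADseq enzyme pair.
ENZYMES = {"EcoRI": ("GAATTC", 1), "MspI": ("CCGG", 1)}

def cut_positions(seq, enzyme):
    """Return 0-based cut coordinates for one enzyme on a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead finds overlapping sites; both sites are palindromic,
    # so scanning one strand is sufficient.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq.upper())]

def ddrad_fragments(seq, enzyme_a, enzyme_b, min_len, max_len):
    """List (start, end, length) for fragments flanked by one cut site of
    each enzyme whose length falls inside the size-selection window."""
    cuts = sorted(
        [(p, enzyme_a) for p in cut_positions(seq, enzyme_a)]
        + [(p, enzyme_b) for p in cut_positions(seq, enzyme_b)]
    )
    fragments = []
    # Adjacent cuts define fragments under the complete-digestion assumption.
    for (start, e1), (end, e2) in zip(cuts, cuts[1:]):
        length = end - start
        if e1 != e2 and min_len <= length <= max_len:
            fragments.append((start, end, length))
    return fragments

# Toy sequence for illustration only (not a real organellar genome).
toy = "AAAA" + "GAATTC" + "T" * 10 + "CCGG" + "A" * 5 + "CCGG" + "TT" + "GAATTC"
print(ddrad_fragments(toy, "EcoRI", "MspI", min_len=10, max_len=100))
```

Running such a digest on a published mitochondrial or plastid genome from a related taxon gives a rough expectation of how many organellar loci a given enzyme pair and size window might recover, although mutations in cut sites will change the outcome for individual samples.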
One could argue that instead of sequencing organellar loci to help
confirm taxonomic identity, low coverage whole genome sequences could be
obtained for each individual. Given that sequencing costs will continue
to decrease, it is likely that the number of whole genome sequences will
continue to rise. This is due, in part, to both advancing sequencing
technology and several consortia focused on generating whole genome
sequences either for specific taxonomic groups, including vertebrates
(Koepfli et al., 2015; Rhie et al., 2021), birds (G. Zhang, 2015), and
arthropods (i5K Consortium, 2013), or more broadly for all eukaryotic
life on Earth (i.e. the Earth BioGenome Project; Lewin et al., 2018). If
the ambitious goals of the Earth BioGenome Project (EBP) come to
fruition, high-quality reference genomes will be available for all
species of eukaryotes. This will no doubt serve as a tremendous resource
to a diverse group of researchers in ecology, evolution, conservation,
epidemiology, agriculture, and medicine. Unfortunately, as of June 2021,
only 0.2% of the 1.6 million animal species (Z.-Q. Zhang, 2013) had a
deposited reference genome (many with relatively low quality), with a
strong bias towards chordates (Hotaling et al., 2021). Without an
adequate reference database of whole genome sequences and associated
metadata, the potential utility of using low coverage whole genomes for
taxonomic confirmation is limited. In addition, cost would still be
prohibitive when sequencing tens or hundreds of individuals as is common
in phylogeographic and population genomic studies. Even though the cost
of a vertebrate genome can be <$1000 USD as of 2018 (Lewin et
al., 2018), this still adds up to a project cost that few can justify
and attain. This is particularly the case for researchers at
predominantly undergraduate institutions with limited access to funding,
and those at institutions in developing countries that may not have
adequate infrastructure (Hotaling et al., 2021). Lower coverage genomes
are an option to further reduce cost, at the risk of obtaining
unreliable base calls that could complicate taxonomic identification.
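The difference in scale between the two approaches can be made explicit with the figures quoted in the text. The Python sketch below is a back-of-the-envelope illustration only: it takes the ~$1000–2000 estimate for one Sanger-sequenced locus per 96-sample plate and treats the "<$1000" per-genome figure as an upper bound. It further assumes that Sanger work is billed per full plate and that costs scale linearly, neither of which is stated in the text. The function names are hypothetical.

```python
import math

SANGER_PLATE_USD = (1000, 2000)  # one locus, per 96-sample plate (from the text)
WGS_PER_GENOME_USD = 1000        # upper bound; the text gives "<$1000" per genome
PLATE_SIZE = 96

def sanger_cost_range(n_individuals):
    """(low, high) cost of one Sanger locus, ASSUMING billing per full plate."""
    plates = math.ceil(n_individuals / PLATE_SIZE)
    return tuple(plates * cost for cost in SANGER_PLATE_USD)

def wgs_cost_upper_bound(n_individuals):
    """Upper bound on whole genome sequencing cost, ASSUMING linear scaling."""
    return n_individuals * WGS_PER_GENOME_USD

for n in (96, 300):
    low, high = sanger_cost_range(n)
    print(f"{n} individuals: Sanger ${low}-{high}, WGS up to ${wgs_cost_upper_bound(n)}")
```

Under these assumptions, a single plate of 96 individuals costs $1000–2000 for one Sanger locus, whereas whole genomes for the same individuals could cost up to $96,000. Actual prices will vary with coverage, provider, and date.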
Organellar markers (for plants and animals) and a handful of nuclear
genes (for fungi) have been sequenced for decades and still serve a
valuable purpose for researchers in ecology, evolution, and systematics.
These markers can also have tangible conservation implications and help
control the spread and resale of threatened and endangered species
(Hobbs et al., 2019). The genomic revolution has fundamentally changed
the way that science is conducted and hypotheses are generated. However,
we should not turn our backs on the extraordinary availability of
high-quality mtDNA and cpDNA sequences that are maintained in accessible
repositories. These publicly available sequences are a vital resource
that can be used to help confirm the taxonomic identity of any sample
sequenced using NGS technology. Sequencing and analyzing markers such as
mtDNA can also provide additional information to help illuminate species
limits, divergence times, dispersal patterns, introgression, and
selection. Until high-quality reference genomes become available for
more of Earth’s biodiversity, and the cost of whole genome sequencing
decreases to the point where it becomes feasible for a broader
proportion of researchers to re-sequence a large number of individuals,
sequencing and analyzing mtDNA and cpDNA should remain a
cost-efficient and relatively simple way to confirm that the
newly generated genomic data are from the taxon of interest.
Acknowledgments
CB would like to thank Erika Crispo, Nolan Kane, Loren Rieseberg, and
two anonymous reviewers for helpful comments that improved the
manuscript, and NSF for their support (DEB-1929679).
Data Accessibility
There are no new data to report in this study.
References
Allio, R., Schomaker-Bastos, A., Romiguier, J., Prosdocimi, F., Nabholz,
B., & Delsuc, F. (2020). MitoFinder: Efficient automated large-scale
extraction of mitogenomic data in target enrichment phylogenomics. Molecular Ecology Resources, 20(4), 892–905.
https://doi.org/10.1111/1755-0998.13160
Amaral, F. R. do, Neves, L. G., Jr, M. F. R. R., Mobili, F., Miyaki, C.
Y., Pellegrino, K. C. M., & Biondo, C. (2015). Ultraconserved Elements
Sequencing as a Low-Cost Source of Complete Mitochondrial Genomes and
Microsatellite Markers in Non-Model Amniotes. PLOS ONE, 10(9), e0138446. https://doi.org/10.1371/journal.pone.0138446
Andrews, K. R., Good, J. M., Miller, M. R., Luikart, G., & Hohenlohe,
P. A. (2016). Harnessing the power of RADseq for ecological and
evolutionary genomics. Nature Reviews Genetics, 17(2),
Article 2. https://doi.org/10.1038/nrg.2015.28
Arevalo, E., Davis, S. K., & Sites, J. W. (1994). Mitochondrial DNA
Sequence Divergence and Phylogenetic Relationships among Eight
Chromosome Races of the Sceloporus grammicus Complex (Phrynosomatidae)
in Central Mexico. Systematic Biology, 43(3), 387–418.
https://doi.org/10.2307/2413675
Ashelford, K. E., Chuzhanova, N. A., Fry, J. C., Jones, A. J., &
Weightman, A. J. (2005). At least 1 in 20 16S rRNA sequence records
currently held in public repositories is estimated to contain
substantial anomalies. Applied and Environmental Microbiology, 71(12), 7724–7736.
https://doi.org/10.1128/AEM.71.12.7724-7736.2005
Avise, J. C. (2000). Phylogeography: The History and Formation of
Species. Harvard University Press.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel,
J. E., Reeb, C. A., & Saunders, N. C. (1987). Intraspecific
Phylogeography: The mitochondrial DNA bridge between population genetics
and systematics. Annual Review of Ecology and Systematics, 18, 489–522.
Baca, S. M., Alexander, A., Gustafson, G. T., & Short, A. E. Z. (2017).
Ultraconserved elements show utility in phylogenetic inference of
Adephaga (Coleoptera) and suggest paraphyly of ‘Hydradephaga.’ Systematic Entomology, 42(4), Article 4.
https://doi.org/10.1111/syen.12244
Baird, N. A., Etter, P. D., Atwood, T. S., Currey, M. C., Shiver, A. L.,
Lewis, Z. A., Selker, E. U., Cresko, W. A., & Johnson, E. A. (2008).
Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers. PLOS ONE, 3(10), Article 10.
https://doi.org/10.1371/journal.pone.0003376
Barrowclough, G. F., & Zink, R. M. (2009). Funds enough, and time:
MtDNA, nuDNA and the discovery of divergence. Molecular Ecology, 18(14), Article 14.
Blair, C. (2009). Molecular phylogenetics and taxonomy of leaf-toed geckos
(Phyllodactylidae: Phyllodactylus) inhabiting the peninsula of Baja
California. Zootaxa, 2027, 28–42.
Blair, C., & Bryson, R. W. (2017). Cryptic diversity and discordance in
single-locus species delimitation methods within horned lizards
(Phrynosomatidae: Phrynosoma). Molecular Ecology Resources, 17(6), 1168–1182. https://doi.org/10.1111/1755-0998.12658
Blair, C., Bryson, R. W., Linkem, C. W., Lazcano, D., Klicka, J., &
McCormack, J. E. (2019). Cryptic diversity in the Mexican highlands:
Thousands of UCE loci help illuminate phylogenetic relationships,
species limits and divergence times of montane rattlesnakes (Viperidae:
Crotalus). Molecular Ecology Resources, 19(2), Article 2.
https://doi.org/10.1111/1755-0998.12970
Boessenkool, S., Star, B., Scofield, R. P., Seddon, P. J., & Waters, J.
M. (2010). Lost in translation or deliberate falsification? Genetic
analyses reveal erroneous museum data for historic penguin specimens. Proceedings of the Royal Society B: Biological Sciences, 277(1684), 1057–1064. https://doi.org/10.1098/rspb.2009.1837
Bonatelli, I. A. S., Zappi, D. C., Taylor, N. P., & Moraes, E. M.
(2013). Usefulness of cpDNA markers for phylogenetic and phylogeographic
analyses of closely related cactus species. Genetics and Molecular
Research: GMR, 12(4), 4579–4585.
https://doi.org/10.4238/2013.February.28.27
Bridge, P. D., Roberts, P. J., Spooner, B. M., & Panchal, G. (2003). On
the unreliability of published DNA sequences. The New
Phytologist, 160(1), 43–48.
https://doi.org/10.1046/j.1469-8137.2003.00861.x
Bryson, R. W., Linkem, C. W., Pavón-Vázquez, C. J., Nieto-Montes de Oca,
A., Klicka, J., & McCormack, J. E. (2017). A phylogenomic perspective
on the biogeography of skinks in the Plestiodon brevirostris group
inferred from target enrichment of ultraconserved elements. Journal of Biogeography, 44(9), Article 9.
https://doi.org/10.1111/jbi.12989
Caparroz, R., Rocha, A. V., Cabanne, G. S., Tubaro, P., Aleixo, A.,
Lemmon, E. M., & Lemmon, A. R. (2018). Mitogenomes of two neotropical
bird species and the multiple independent origin of mitochondrial gene
orders in Passeriformes. Molecular Biology Reports, 45(3),
279–285. https://doi.org/10.1007/s11033-018-4160-5
Castoe, T. A., Spencer, C. L., & Parkinson, C. L. (2007).
Phylogeographic structure and historical demography of the western
diamondback rattlesnake (Crotalus atrox): A perspective on North
American desert biogeography. Molecular Phylogenetics and
Evolution, 42(1), Article 1.
https://doi.org/10.1016/j.ympev.2006.07.002
Chafin, T. K., Martin, B. T., Mussmann, S. M., Douglas, M. R., &
Douglas, M. E. (2018). FRAGMATIC: In silico locus prediction and its
utility in optimizing ddRADseq projects. Conservation Genetics
Resources, 10(3), 325–328.
https://doi.org/10.1007/s12686-017-0814-1
Çoraman, E., Dundarova, H., Dietz, C., & Mayer, F. (2020). Patterns of
mtDNA introgression suggest population replacement in Palaearctic
whiskered bat species. Royal Society Open Science, 7(6),
191805. https://doi.org/10.1098/rsos.191805
Crawford, N. G., Faircloth, B. C., McCormack, J. E., Brumfield, R. T.,
Winker, K., & Glenn, T. C. (2012). More than 1000 ultraconserved
elements provide evidence that turtles are the sister group of
archosaurs. Biology Letters, 8(5), Article 5.
Dam, M. H. V., Lam, A. W., Sagata, K., Gewa, B., Laufa, R., Balke, M.,
Faircloth, B. C., & Riedel, A. (2017). Ultraconserved elements (UCEs)
resolve the phylogeny of Australasian smurf-weevils. PLOS ONE, 12(11), Article 11. https://doi.org/10.1371/journal.pone.0188044
Davey, J. W., & Blaxter, M. L. (2010). RADSeq: Next-generation
population genetics. Briefings in Functional Genomics, 9(5–6), Article 5–6.
Derkarabetian, S., Benavides, L. R., & Giribet, G. (2019). Sequence
capture phylogenomics of historical ethanol-preserved museum specimens:
Unlocking the rest of the vault. Molecular Ecology Resources, 19(6), 1531–1544. https://doi.org/10.1111/1755-0998.13072
DeSalle, R., & Goldstein, P. (2019). Review and Interpretation of
Trends in DNA Barcoding. Frontiers in Ecology and Evolution, 7. https://www.frontiersin.org/articles/10.3389/fevo.2019.00302
Du, Z.-Y., Harris, A. J., & Xiang, Q.-Y. J. (2020). Phylogenomics,
co-evolution of ecological niche and morphology, and historical
biogeography of buckeyes, horsechestnuts, and their relatives
(Hippocastaneae, Sapindaceae) and the value of RAD-Seq for deep
evolutionary inferences back to the Late Cretaceous. Molecular
Phylogenetics and Evolution, 145, 106726.
https://doi.org/10.1016/j.ympev.2019.106726
Eaton, D. A. R., & Overcast, I. (2020). ipyrad: Interactive assembly
and analysis of RADseq datasets. Bioinformatics, 36(8),
Article 8. https://doi.org/10.1093/bioinformatics/btz966
Eaton, D. A., & Ree, R. H. (2013). Inferring phylogeny and
introgression using RADseq data: An example from flowering plants
(Pedicularis: Orobanchaceae). Systematic Biology, 62(5),
Article 5.
Edwards. (2009). Looking forwards or looking backwards in avian
phylogeography? A comment to Zink and Barrowclough 2008. Molecular
Ecology, 18, 2930–2933.
Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K.,
Buckler, E. S., & Mitchell, S. E. (2011). A robust, simple
genotyping-by-sequencing (GBS) approach for high diversity species. PLOS ONE, 6(5), Article 5.
https://doi.org/10.1371/journal.pone.0019379
Faircloth, B. C., McCormack, J. E., Crawford, N. G., Harvey, M. G.,
Brumfield, R. T., & Glenn, T. C. (2012). Ultraconserved elements anchor
thousands of genetic markers spanning multiple evolutionary timescales. Systematic Biology, 61(5), Article 5.
https://doi.org/10.1093/sysbio/sys004
Finger, N., Farleigh, K., Bracken, J. T., Leaché, A. D., François, O.,
Yang, Z., Flouri, T., Charran, T., Jezkova, T., Williams, D. A., &
Blair, C. (2022). Genome-Scale Data Reveal Deep Lineage Divergence and a
Complex Demographic History in the Texas Horned Lizard (Phrynosoma
cornutum) throughout the Southwestern and Central United States. Genome Biology and Evolution, 14(1), evab260.
https://doi.org/10.1093/gbe/evab260
Fujisawa, T., & Barraclough, T. G. (2013). Delimiting species using
single-locus data and the generalized mixed yule coalescent (GMYC)
approach: A revised method and evaluation on simulated datasets. Systematic Biology, syt033.
Gautier, M., Gharbi, K., Cezard, T., Foucaud, J., Kerdelhué, C., Pudlo,
P., Cornuet, J.-M., & Estoup, A. (2013). The effect of RAD allele
dropout on the estimation of genetic variation within and between
populations. Molecular Ecology, 22(11), 3165–3178.
https://doi.org/10.1111/mec.12089
Harrington, S. M., Hollingsworth, B. D., Higham, T. E., & Reeder, T. W.
(2018). Pleistocene climatic fluctuations drive isolation and secondary
contact in the red diamond rattlesnake (Crotalus ruber) in Baja
California. Journal of Biogeography, 45(1), Article 1.
https://doi.org/10.1111/jbi.13114
Hebert. (2004). Ten species in one: DNA barcoding reveals cryptic species
in the neotropical skipper butterfly Astraptes fulgerator. PNAS, 101(41), Article 41.
Hebert, P. D., Cywinska, A., & Ball, S. L. (2003). Biological
identifications through DNA barcodes. Proceedings of the Royal
Society of London. Series B: Biological Sciences, 270(1512),
Article 1512.
Hebert, P. D. N., & Gregory, T. R. (2005). The Promise of DNA Barcoding
for Taxonomy. Systematic Biology, 54(5), Article 5.
https://doi.org/10.1080/10635150500354886
Hickerson, M., Carstens, B., Cavender-Bares, J., Crandall, K., Graham,
C., Johnson, J., Rissler, L., Victoriano, P., & Yoder, A. (2010).
Phylogeography’s past, present, and future: 10 years after. Molecular Phylogenetics and Evolution, 54(1), Article 1.
Hickerson, M. J., Meyer, C. P., & Moritz, C. (2006). DNA barcoding will
often fail to discover new animal species over broad parameter space. Systematic Biology, 55(5), 729–739.
https://doi.org/10.1080/10635150600969898
Hobbs, C. A. D., Potts, R. W. A., Bjerregaard Walsh, M., Usher, J., &
Griffiths, A. M. (2019). Using DNA Barcoding to Investigate Patterns of
Species Utilisation in UK Shark Products Reveals Threatened Species on
Sale. Scientific Reports, 9(1), Article 1.
https://doi.org/10.1038/s41598-018-38270-3
Honeycutt, R. L. (2021). Editorial: DNA Barcodes: Controversies,
Mechanisms, and Future Applications. Frontiers in Ecology and
Evolution, 9.
https://www.frontiersin.org/articles/10.3389/fevo.2021.718865
Hotaling, S., Kelley, J. L., & Frandsen, P. B. (2021). Toward a genome
sequence for every animal: Where are we now? Proceedings of the
National Academy of Sciences, 118(52), e2109019118.
https://doi.org/10.1073/pnas.2109019118
Hou, Y., Nowak, M. D., Mirré, V., Bjorå, C. S., Brochmann, C., & Popp,
M. (2015). Thousands of RAD-seq Loci Fully Resolve the Phylogeny of the
Highly Disjunct Arctic-Alpine Genus Diapensia (Diapensiaceae). PLOS ONE, 10(10), e0140175.
https://doi.org/10.1371/journal.pone.0140175
i5K Consortium. (2013). The i5K Initiative: Advancing arthropod genomics
for knowledge, human health, agriculture, and the environment. The
Journal of Heredity, 104(5), 595–600.
https://doi.org/10.1093/jhered/est050
Kapli, P., Lutteropp, S., Zhang, J., Kobert, K., Pavlidis, P.,
Stamatakis, A., & Flouri, T. (2017). Multi-rate Poisson tree processes
for single-locus species delimitation under maximum likelihood and
Markov chain Monte Carlo. Bioinformatics (Oxford, England), 33(11), Article 11. https://doi.org/10.1093/bioinformatics/btx025
Kato, D., Suzuki, H., Tsuruta, A., Maeda, J., Hayashi, Y., Arima, K.,
Ito, Y., & Nagano, Y. (2020). Evaluation of the population structure
and phylogeography of the Japanese Genji firefly, Luciola cruciata, at
the nuclear DNA level using RAD-Seq analysis. Scientific Reports, 10, 1533. https://doi.org/10.1038/s41598-020-58324-9
Koepfli, K.-P., Paten, B., Genome 10K Community of Scientists, &
O’Brien, S. J. (2015). The Genome 10K Project: A way forward. Annual Review of Animal Biosciences, 3, 57–111.
https://doi.org/10.1146/annurev-animal-090414-014900
Laczkó, L., Jordán, S., & Sramkó, G. (2022). The RadOrgMiner pipeline:
Automated genotyping of organellar loci from RADseq data. Methods
in Ecology and Evolution, 13(9), 1962–1975.
https://doi.org/10.1111/2041-210X.13937
Leaché, A. D., & Cole, C. J. (2007). Hybridization between multiple
fence lizard lineages in an ecotone: Locally discordant variation in
mitochondrial DNA, chromosomes, and morphology. Molecular
Ecology, 16(5), 1035–1054.
https://doi.org/10.1111/j.1365-294X.2006.03194.x
Leaché, A. D., & McGuire, J. A. (2006). Phylogenetic relationships of
horned lizards (Phrynosoma) based on nuclear and mitochondrial data:
Evidence for a misleading mitochondrial gene tree. Molecular
Phylogenetics and Evolution, 39(3), Article 3.
https://doi.org/10.1016/j.ympev.2005.12.016
Lemmon. (2012). Anchored hybrid enrichment for massively high-throughput
phylogenomics. Systematic Biology, 1–18.
Lewin, H. A., Robinson, G. E., Kress, W. J., Baker, W. J., Coddington,
J., Crandall, K. A., Durbin, R., Edwards, S. V., Forest, F., Gilbert, M.
T. P., Goldstein, M. M., Grigoriev, I. V., Hackett, K. J., Haussler, D.,
Jarvis, E. D., Johnson, W. E., Patrinos, A., Richards, S.,
Castilla-Rubio, J. C., … Zhang, G. (2018). Earth BioGenome
Project: Sequencing life for the future of life. Proceedings of
the National Academy of Sciences of the United States of America, 115(17), 4325–4333. https://doi.org/10.1073/pnas.1720115115
Lindell, J., Méndez-de la Cruz, F. R., & Murphy, R. W. (2005). Deep
genealogical history without population differentiation: Discordance
between mtDNA and allozyme divergence in the zebra-tailed lizard
(Callisaurus draconoides). Molecular Phylogenetics and Evolution, 36(3), Article 3. https://doi.org/10.1016/j.ympev.2005.04.031
Lindell, J., Méndez-de la Cruz, F. R., & Murphy, R. W. (2008). Deep
biogeographical history and cytonuclear discordance in the black-tailed
brush lizard (Urosaurus nigricaudus) of Baja California. Biological Journal of the Linnean Society, 94(1), Article
1. https://doi.org/10.1111/j.1095-8312.2008.00976.x
Lindell, J., Ngo, A., & Murphy, R. W. (2006). Deep genealogies and the
mid-peninsular seaway of Baja California. Journal of
Biogeography, 33(8), Article 8.
https://doi.org/10.1111/j.1365-2699.2006.01532.x
Lyra, M. L., Joger, U., Schulte, U., Slimani, T., El Mouden, E. H.,
Bouazza, A., Künzel, S., Lemmon, A. R., Lemmon, E. M., & Vences, M.
(2017). The mitochondrial genomes of Atlas Geckos (Quedenfeldtia):
Mitogenome assembly from transcriptomes and anchored hybrid enrichment
datasets. Mitochondrial DNA. Part B, Resources, 2(1),
356–358. https://doi.org/10.1080/23802359.2017.1339212
Mastrantonio, V., Porretta, D., Urbanelli, S., Crasta, G., & Nascetti,
G. (2016). Dynamics of mtDNA introgression during species range
expansion: Insights from an experimental longitudinal study. Scientific Reports, 6(1), Article 1.
https://doi.org/10.1038/srep30355
Meger, J., Ulaszewski, B., Vendramin, G. G., & Burczyk, J. (2019).
Using reduced representation libraries sequencing methods to identify
cpDNA polymorphisms in European beech (Fagus sylvatica L). Tree
Genetics & Genomes, 15(1), 7.
https://doi.org/10.1007/s11295-018-1313-6
Meiklejohn, K. A., Damaso, N., & Robertson, J. M. (2019). Assessment of
BOLD and GenBank – Their accuracy and reliability for the
identification of biological materials. PLOS ONE, 14(6),
e0217084. https://doi.org/10.1371/journal.pone.0217084
Melo, A. T. O., & Hale, I. (2019). Expanded functionality, increased
accuracy, and enhanced speed in the de novo genotyping-by-sequencing
pipeline GBS-SNP-CROP. Bioinformatics (Oxford, England), 35(17), 3215. https://doi.org/10.1093/bioinformatics/bty1073
Miller, C. D., Forthman, M., Miller, C. W., & Kimball, R. T. (2022).
Extracting ‘legacy loci’ from an invertebrate sequence capture data set. Zoologica Scripta, 51(1), 14–31.
https://doi.org/10.1111/zsc.12513
Moritz, C., & Cicero, C. (2004). DNA Barcoding: Promise and Pitfalls. PLoS Biology, 2(10), e354.
https://doi.org/10.1371/journal.pbio.0020354
Mulcahy, D. G., Ibáñez, R., Jaramillo, C. A., Crawford, A. J., Ray, J.
M., Gotte, S. W., Jacobs, J. F., Wynn, A. H., Gonzalez-Porter, G. P.,
McDiarmid, R. W., Crombie, R. I., Zug, G. R., & Queiroz, K. de. (2022).
DNA barcoding of the National Museum of Natural History reptile tissue
holdings raises concerns about the use of natural history collections
and the responsibilities of scientists in the molecular age. PLOS
ONE, 17(3), e0264930.
https://doi.org/10.1371/journal.pone.0264930
Myers, E. A., Hickerson, M. J., & Burbrink, F. T. (2017). Asynchronous
diversification of snakes in the North American warm deserts. Journal of Biogeography, 44(2), Article 2.
https://doi.org/10.1111/jbi.12873
Peterson, B. K., Weber, J. N., Kay, E. H., Fisher, H. S., & Hoekstra,
H. E. (2012). Double digest RADseq: An inexpensive method for de novo
SNP discovery and genotyping in model and non-model species. PLOS
ONE, 7(5), Article 5.
Ratnasingham, S., & Hebert, P. D. N. (2007). BOLD: The Barcode of Life
Data System (http://www.barcodinglife.org). Molecular Ecology
Notes, 7(3), 355–364.
https://doi.org/10.1111/j.1471-8286.2007.01678.x
Rhie, A., McCarthy, S. A., Fedrigo, O., Damas, J., Formenti, G., Koren,
S., Uliano-Silva, M., Chow, W., Fungtammasan, A., Kim, J., Lee, C., Ko,
B. J., Chaisson, M., Gedman, G. L., Cantin, L. J., Thibaud-Nissen, F.,
Haggerty, L., Bista, I., Smith, M., … Jarvis, E. D. (2021).
Towards complete and error-free genome assemblies of all vertebrate
species. Nature, 592(7856), Article 7856.
https://doi.org/10.1038/s41586-021-03451-0
Rivera-Colón, A. G., Rochette, N. C., & Catchen, J. M. (2021).
Simulation with RADinitio improves RADseq experimental design and sheds
light on sources of missing data. Molecular Ecology Resources, 21(2), 363–378. https://doi.org/10.1111/1755-0998.13163
Rochette, N. C., Rivera-Colón, A. G., & Catchen, J. M. (2019). Stacks
2: Analytical methods for paired-end sequencing improve RADseq-based
population genomics. Molecular Ecology, 28(21),
4737–4754. https://doi.org/10.1111/mec.15253
Schield, D. R., Card, D. C., Adams, R. H., Jezkova, T., Reyes-Velasco,
J., Proctor, F. N., Spencer, C. L., Herrmann, H.-W., Mackessy, S. P., &
Castoe, T. A. (2015). Incipient speciation with biased gene flow between
two lineages of the Western Diamondback Rattlesnake (Crotalus atrox). Molecular Phylogenetics and Evolution, 83, 213–223.
https://doi.org/10.1016/j.ympev.2014.12.006
Scott, B., Baker, E., Woodburn, M., Vincent, S., Hardy, H., & Smith, V.
S. (2019). The Natural History Museum Data Portal. Database, 2019, baz038. https://doi.org/10.1093/database/baz038
Simon, C., Gordon, E. R. L., Moulds, M. S., Cole, J. A., Haji, D.,
Lemmon, A. R., Lemmon, E. M., Kortyna, M., Nazario, K., Wade, E. J.,
Meister, R. C., Goemans, G., Chiswell, S. M., Pessacq, P., Veloso, C.,
McCutcheon, J. P., & Łukasik, P. (2019). Off-target capture data,
endosymbiont genes and morphology reveal a relict lineage that is sister
to all other singing cicadas. Biological Journal of the Linnean
Society, 128(4), 865–886.
https://doi.org/10.1093/biolinnean/blz120
Soltis, D. E., Gitzendanner, M. A., Strenge, D. D., & Soltis, P. S.
(1997). Chloroplast DNA intraspecific phylogeography of plants from the
Pacific Northwest of North America. Plant Systematics and
Evolution, 206(1), 353–373. https://doi.org/10.1007/BF00987957
Stobie, C. S., Cunningham, M. J., Oosthuizen, C. J., & Bloomer, P.
(2019). Finding stories in noise: Mitochondrial portraits from RAD data. Molecular Ecology Resources, 19(1), 191–205.
https://doi.org/10.1111/1755-0998.12953
Talavera, G., Dincă, V., & Vila, R. (2013). Factors affecting species
delimitations with the GMYC model: Insights from a butterfly survey. Methods in Ecology and Evolution, 4(12), Article 12.
https://doi.org/10.1111/2041-210X.12107
Upton, D. E., & Murphy, R. W. (1997). Phylogeny of the side-blotched
lizards (Phrynosomatidae: Uta) based on mtDNA sequences: Support for
midpeninsular seaway in Baja California. Molecular Phylogenetics
and Evolution, 8(1), Article 1.
Villaverde, T., Maguilla, E., Luceño, M., & Hipp, A. L. (2021).
Assessing the sensitivity of divergence time estimates to locus
sampling, calibration points, and model priors in a RAD-seq phylogeny of
Carex section Schoenoxiphium. Journal of Systematics and
Evolution, 59(4), 687–697. https://doi.org/10.1111/jse.12724
Vitelli, M., Vessella, F., Cardoni, S., Pollegioni, P., Denk, T., Grimm,
G. W., & Simeone, M. C. (2016). Phylogeographic structuring of plastome
diversity in Mediterranean oaks (Quercus Group Ilex, Fagaceae). Tree Genetics & Genomes, 13(1), 3.
https://doi.org/10.1007/s11295-016-1086-8
Wagner, C. E., Keller, I., Wittwer, S., Selz, O. M., Mwaiko, S.,
Greuter, L., Sivasundar, A., & Seehausen, O. (2013). Genome‐wide RAD
sequence data provide unprecedented resolution of species boundaries and
relationships in the Lake Victoria cichlid adaptive radiation. Molecular Ecology, 22(3), Article 3.
Will, K. W., Mishler, B. D., & Wheeler, Q. D. (2005). The Perils of DNA
Barcoding and the Need for Integrative Taxonomy. Systematic
Biology, 54(5), 844–851.
https://doi.org/10.1080/10635150500354878
Yan, M., Xiong, Y., Liu, R., Deng, M., & Song, J. (2018). The
Application and Limitation of Universal Chloroplast Markers in
Discriminating East Asian Evergreen Oaks. Frontiers in Plant
Science, 9.
https://www.frontiersin.org/articles/10.3389/fpls.2018.00569
Zarza, E., Connors, E. M., Maley, J. M., Tsai, W. L. E., Heimes, P.,
Kaplan, M., & McCormack, J. E. (2018). Combining ultraconserved
elements and mtDNA data to uncover lineage diversity in a Mexican
highland frog (Sarcohyla; Hylidae). PeerJ, 6, e6045.
https://doi.org/10.7717/peerj.6045
Zhang, G. (2015). Bird sequencing project takes off. Nature, 522(7554), Article 7554. https://doi.org/10.1038/522034d
Zhang, Z.-Q. (2013). Animal biodiversity: An outline of higher-level
classification and survey of taxonomic richness (Addenda 2013). Zootaxa, 3703, 1–82.
https://doi.org/10.11646/zootaxa.3703.1.1
Zink, R. M., & Barrowclough, G. F. (2008). Mitochondrial DNA under
siege in avian phylogeography. Molecular Ecology, 17(9),
2107–2121. https://doi.org/10.1111/j.1365-294X.2008.03737.x