Key words: animals, cpDNA, genomics, mtDNA, plants, taxonomy
Next-generation sequencing (NGS) and genomics continue to transform how biologists address fundamental questions in ecology and evolution. The quantity of data that can be generated quickly and cheaply enables researchers to interrogate genomes for hundreds or thousands of loci that can be used for evolutionary inference. This phenomenon has led to an important paradigm shift in how phylogenetic, phylogeographic, and population genomic studies are designed. Historically, mitochondrial DNA (mtDNA) was the primary molecular marker used to estimate evolutionary history and demographic parameters in animals (Avise, Arnold, Ball, Bermingham, Lamb, Neigel, Reeb, & Saunders, 1987; Avise, 2000), whereas chloroplast markers were used extensively in plant phylogenetics and phylogeography (Bonatelli, Zappi, Taylor, & Moraes, 2013; Hickerson et al., 2010; Soltis, Gitzendanner, Strenge, & Soltis, 1997). Subsequently, microsatellites became a popular multilocus, fragment-based method used to investigate population structure in the nuclear genome. More recently, methods and markers such as RADseq and its derivatives (Baird et al., 2008; Elshire et al., 2011; Peterson et al., 2012), ultraconserved elements (UCEs; Faircloth et al., 2012), anchored hybrid enrichment (AHEs; Lemmon, 2012), transcriptomes, and whole genomes have emerged to take full advantage of the power of NGS technologies. The potential resolution provided by thousands of loci is impressive, but not without challenges. Issues with assembly, paralogy, variant calling, phasing, and sequencing errors can all impact subsequent evolutionary inference. Another major issue that can impact these genomic studies, and that has not been properly addressed, is the accurate taxonomic identification of samples. I argue that sequencing protocols targeting mtDNA genes (animals), chloroplast DNA (cpDNA; plants) and select nuclear genes (e.g. ITS; fungi, plants) should remain powerful resources in the molecular ecologist’s toolbox, even though the limitations of these markers are well established, and there appears to be a downward trend in using these data in modern genomic studies (Fig. 1).
The pros and cons of mtDNA and cpDNA have been discussed in numerous publications (Barrowclough & Zink, 2009; Edwards, 2009; Yan et al., 2018; Zink & Barrowclough, 2008). The smaller effective population size of the mitochondrial genome, when coupled with a relatively high substitution rate, is often beneficial when investigating evolutionary processes on more recent timescales. Recently diverged groups are expected to reach reciprocal monophyly faster with mitochondrial genes than with nuclear genes. This phenomenon alone can provide novel insight into mechanisms of reproductive isolation and speciation, particularly when combined with powerful analytical tools for species delimitation (Blair & Bryson, 2017; Fujisawa & Barraclough, 2013; Kapli et al., 2017; Talavera et al., 2013). Two classic North American examples that highlight the utility of mtDNA for barrier detection and lineage divergence are the multiple genetic breaks across the Peninsula of Baja California (Blair, 2009; Lindell et al., 2005, 2006, 2008; Upton & Murphy, 1997) and differentiation across the Continental Divide/Cochise Filter Barrier (Castoe et al., 2007; Myers et al., 2017). Many of these patterns have been subsequently corroborated with genomic data (Harrington et al., 2018; Schield et al., 2015), highlighting the benefit of mtDNA for generating primary evolutionary hypotheses. Several taxonomic groups also show evidence of mtDNA or cpDNA introgression (Çoraman et al., 2020; Leaché & Cole, 2007; Leaché & McGuire, 2006; Mastrantonio et al., 2016; Vitelli et al., 2016; Yan et al., 2018), evolutionary patterns that would be invisible with only nuclear DNA (nDNA). Another major benefit of sequencing mtDNA, cpDNA, and select nuclear genes (e.g. ITS) is the vast quantity of homologous data in public databases, generated in part by the DNA barcoding initiative (Hebert, 2004; Hebert, Cywinska, & Ball, 2003; Hebert & Gregory, 2005). Although DNA barcoding and single locus studies have been discussed and criticized repeatedly in the literature (Hickerson, Meyer, & Moritz, 2006; Moritz & Cicero, 2004; Will, Mishler, & Wheeler, 2005), there is a continuing utility in sequencing rapidly evolving and/or highly discriminatory loci, at the least to help generate new evolutionary and taxonomic hypotheses (DeSalle & Goldstein, 2019; Honeycutt, 2021). Finally, it is relatively easy to amplify and sequence mtDNA and cpDNA in a standard molecular laboratory due to high copy numbers and the availability of primers, making the work feasible for faculty even at moderately funded institutions.
Although the benefits of mtDNA and cpDNA are well-known and generally appreciated, a major limitation of these markers is that they are composed of linked genes that seldom undergo recombination. From an analytical perspective, this essentially means that the entire plastid or mitochondrial genome should be treated as a single locus, which in turn provides a very narrow view of the evolutionary history of the group. In short, an mtDNA or cpDNA gene tree may or may not be congruent with the species tree due to processes such as the stochasticity of the coalescent process (i.e. incomplete lineage sorting; ILS), introgression, horizontal gene transfer, and gene duplication or loss. Thus, in many cases organellar sequences will not be a reliable indicator of taxonomy, and some nuclear loci will be required. The strength of NGS and methods like ddRADseq and UCEs is that they provide thousands of essentially independent coalescent histories with which researchers can estimate important evolutionary parameters such as species limits, species trees, divergence times, patterns and rates of gene flow, selection, and population size changes (Andrews et al., 2016; Baca et al., 2017; Blair et al., 2019; Bryson et al., 2017; Davey & Blaxter, 2010; Finger et al., 2022).
Because of the perpetual decrease in DNA sequencing costs, along with the realization of the limitations of single locus studies, more contemporary publications are relying solely on genomic data for evolutionary inference (Crawford et al., 2012; Dam et al., 2017; D. A. Eaton & Ree, 2013; Hou et al., 2015; Villaverde et al., 2021; Wagner et al., 2013), although exceptions do exist (Finger et al., 2022; Kato et al., 2020; Zarza et al., 2018). I argue that this is not an optimal use of resources, given the potential benefits of organellar markers outlined above. Another benefit that has not been properly discussed in the context of genomics is how these markers can be used to help verify taxonomic identities prior to downstream analysis. This is particularly relevant given the widespread nature of cryptic species in many taxonomic groups. When working with mtDNA and cpDNA, an optimal approach following quality trimming and assembly is to BLAST each sequence to verify both the correct gene and taxonomic identity. This is made possible by the decades of sequence data that have been deposited in databases such as GenBank and the Barcode of Life Data System (BOLD; http://v4.boldsystems.org/; Ratnasingham & Hebert, 2007). Note that this approach is currently impossible with reduced representation genomic data from non-model species. Instead, researchers must assume that they sequenced what they thought they were sequencing. Anomalous samples can possibly be identified in subsequent analyses (e.g. phylogenetic and/or clustering analysis), but this approach will not reveal the true identity of the individuals. Moreover, an anomalous, highly divergent sequence might erroneously suggest an unknown, cryptic species. This would lead to the researcher spending additional time and resources trying to decipher the true history and taxonomy of the sample. In a worst-case scenario, one might describe the taxon as a new species simply based on results of the molecular analysis. It should be noted, however, that databases like GenBank can and do contain errors (Ashelford et al., 2005; Bridge et al., 2003; Meiklejohn et al., 2019; Mulcahy et al., 2022), and processes such as mtDNA introgression and ILS could lead to incorrect species identities. In addition, taxonomic biases exist in these databases. As of January 2023, BOLD contained sequences for 246k species of animals, 72k species of plants, and only 24k species of fungi and “other species” such as protists.
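The BLAST-based identity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is BLAST tabular output in the standard 12-column layout, the sample names, accessions, and species labels are hypothetical, the accession-to-species lookup is assumed to have been built separately, and the 97% identity threshold is an arbitrary placeholder rather than a recommended cutoff.

```python
import csv
import io

# Standard 12-column BLAST tabular layout (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def top_hits(blast_tsv):
    """Return the highest-bitscore hit for each query sequence."""
    best = {}
    reader = csv.DictReader(
        io.StringIO(blast_tsv), fieldnames=BLAST_COLUMNS, delimiter="\t"
    )
    for row in reader:
        row["pident"] = float(row["pident"])
        row["bitscore"] = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


def flag_mismatches(blast_tsv, expected_species, accession_to_species,
                    min_identity=97.0):
    """List samples whose best hit disagrees with the label or is a weak match.

    min_identity is an illustrative threshold (an assumption), not a standard.
    """
    flagged = []
    for sample, hit in sorted(top_hits(blast_tsv).items()):
        hit_species = accession_to_species.get(hit["sseqid"], "unknown")
        if hit_species != expected_species[sample] or hit["pident"] < min_identity:
            flagged.append(
                (sample, expected_species[sample], hit_species, hit["pident"])
            )
    return flagged


# Hypothetical data: two tissue loans, both labelled "Species A" by the museum.
example_blast = "\n".join([
    "sample01\tACC001\t99.5\t650\t3\t0\t1\t650\t1\t650\t0.0\t1180",
    "sample01\tACC002\t91.0\t650\t58\t0\t1\t650\t1\t650\t0.0\t870",
    "sample02\tACC002\t99.1\t648\t6\t0\t1\t648\t1\t648\t0.0\t1165",
])
expected = {"sample01": "Species A", "sample02": "Species A"}
acc_map = {"ACC001": "Species A", "ACC002": "Species B"}

flagged = flag_mismatches(example_blast, expected, acc_map)
print(flagged)
```

Given the database errors, introgression, and ILS noted above, a flagged sample is best treated as a prompt for follow-up (e.g. checking the voucher with the curator) rather than as an automatic reassignment.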
Obtaining samples for phylogeographic and population genomic studies can be challenging due to funding limitations, logistics, geography, lack of personnel, and time constraints, particularly for researchers at underfunded and primarily undergraduate institutions. Thus, it is common for researchers to reach out to natural history museums to inquire about tissue loans to help facilitate investigations. These museums represent a vitally important record of life on Earth that have enabled ecological and evolutionary research for centuries. Unfortunately, taxonomic identifications made by collectors in the field are not always accurate, and museum databases may contain errors and/or outdated information (Boessenkool et al., 2010; Mulcahy et al., 2022; Scott et al., 2019). This is the case even with relatively well-known groups distributed throughout North America. Given that most loans for molecular studies are tissues only, researchers have little information available to verify taxonomic identity. By sequencing and analyzing common mtDNA, cpDNA and nuclear genes in addition to genomic data, researchers can verify the correct species ID of each sample through a simple BLAST or BOLD search (https://ibol.org/, http://v4.boldsystems.org/). Note that the sequenced genes do not have to be limited to the barcoding markers, as sequences for many other genes are widely available across the Tree of Life (e.g. the mitochondrial ND4 gene has been sequenced in squamate reptiles for decades; Arevalo et al., 1994).
Although amplifying and Sanger sequencing an additional gene (or pair of genes for plants if following barcoding recommendations) adds to the overall project cost and time commitment, I argue that the cost is well worth the potential gain. Depending on the available infrastructure, amplifying and sequencing a single mtDNA gene for a plate (96 samples) should add ~$1000-2000 to the overall cost of the project. Even though the cost of Sanger sequencing is relatively high on a per base basis, the amount of additional information it can provide is vast, and can reassure researchers that they are working with the species of interest. These data can also be used to investigate deep coalescence, introgression, sex-biased dispersal, patterns of selection, and putative species limits. I will note that in many cases (particularly in animals), it will not be necessary to sequence multiple genes, as the additional benefit likely declines substantially after the first locus. This is due, in part, to the linked nature of organellar genomes as discussed above. Additional genes can potentially provide more phylogenetically informative characters, but each gene will inherently reflect the same history. Finally, the relatively simple process of amplifying and sequencing a single informative gene can help fix errors in museum databases through a quick correspondence with curators (Mulcahy et al., 2022).
Although Sanger sequencing a single organellar locus for up to a plate may be justified in some instances, the total project cost can quickly become prohibitive if additional loci and individuals are needed. Based on current price estimates (January 2023), the total cost of amplifying and sequencing a second cytoplasmic or nuclear locus using Sanger technology (96 samples, excluding personnel costs) would likely meet or surpass the cost of a GBS/ddRADseq protocol offered by commercial labs (e.g. https://biotech.wisc.edu/). Thus, there has been recent interest in mining genomic data sets for organellar loci at no additional sequencing cost. Organellar markers are often sequenced in genomics studies as “by-catch” with certain library preparation techniques (Allio et al., 2020). Whole genome sequencing will inherently capture the organellar genome in addition to nuclear sequences, and researchers should make full use of these data by assembling and annotating these genomes. Perhaps more surprisingly, targeted sequence capture methods like UCEs and AHEs also tend to return organellar sequences, which should be fully utilized in subsequent evolutionary analyses (Amaral et al., 2015; Caparroz et al., 2018; Derkarabetian et al., 2019; Lemmon, 2012; Lyra et al., 2017; Miller et al., 2022; Simon et al., 2019; Zarza et al., 2018). Finally, several RADseq/GBS studies have isolated and analyzed organellar markers to provide a more comprehensive perspective on evolutionary history (Du et al., 2020; Meger et al., 2019; Stobie et al., 2019). Some of the traditionally used bioinformatics packages such as Stacks 2 (Rochette et al., 2019) and ipyrad (D. A. R. Eaton & Overcast, 2020) can be used to isolate mtDNA and cpDNA sequences from raw reads, while newer tools are beginning to emerge that are designed specifically for this task (Laczkó et al., 2022). However, obtaining organellar loci from RADseq libraries poses numerous challenges. First, the number and types of loci recovered will inherently depend on characteristics of the organellar genomes under study and the restriction enzymes used (Laczkó et al., 2022). Second, allelic dropout due to mutations in restriction enzyme cut sites can result in a highly fragmented data set and possibly lead to biased diversity estimates (Gautier et al., 2013). This will be particularly problematic when working with animal mtDNA genomes given the fast rate of molecular evolution in many groups, resulting in some samples with little data to help verify identity. Third, the loci recovered may have limited taxonomic utility due to sequence conservation and/or the absence of homologous sequences in public repositories. Thus, in theory there may be little need to use Sanger-based technology to obtain organellar sequences in modern genomic studies, although more work is needed that focuses on optimal library preparation techniques and bioinformatics protocols. Note that there are currently several tools available to help researchers plan and design RADseq studies that can help target organellar markers (Chafin et al., 2018; Melo & Hale, 2019; Rivera-Colón et al., 2021). Another alternative to minimize cost may be to Sanger sequence a small proportion of anomalous samples identified through phylogenetic and population genetic analyses. An obvious weakness of this approach is that organellar data would not be available for all samples, limiting the additional utility provided by these markers.
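The dependence of recovered organellar loci on genome sequence and enzyme choice can be illustrated with a toy in silico double digest. The sketch below is not a substitute for the planning tools cited above and makes several simplifying assumptions: the sequence is treated as linear (organellar genomes are typically circular), the start of each recognition site stands in for the exact cut position, and the toy sequence and size windows are invented for illustration. GAATTC and CCGG are the recognition sequences of EcoRI and MspI, used here purely as an example enzyme pair.

```python
import re


def cut_positions(sequence, site):
    """Start positions of every occurrence of a recognition site."""
    pattern = "(?=" + re.escape(site) + ")"
    return [m.start() for m in re.finditer(pattern, sequence.upper())]


def ddrad_fragments(sequence, rare_site, common_site, min_len, max_len):
    """Fragments flanked by one site of each enzyme and inside the size window."""
    cuts = sorted(
        [(pos, "rare") for pos in cut_positions(sequence, rare_site)]
        + [(pos, "common") for pos in cut_positions(sequence, common_site)]
    )
    fragments = []
    # Walk over neighbouring cut sites; keep fragments with two different ends.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        if left != right and min_len <= length <= max_len:
            fragments.append((start, end, length))
    return fragments


# Invented toy sequence with two EcoRI sites and two MspI sites.
toy_genome = (
    "A" * 50 + "GAATTC" + "T" * 120 + "CCGG"
    + "A" * 300 + "CCGG" + "T" * 80 + "GAATTC" + "A" * 40
)

narrow = ddrad_fragments(toy_genome, "GAATTC", "CCGG", min_len=100, max_len=400)
wide = ddrad_fragments(toy_genome, "GAATTC", "CCGG", min_len=50, max_len=400)
print(len(narrow), len(wide))
```

In this toy case, widening the size window recovers a second fragment, and a single mutation in either recognition site would remove a locus entirely, which is the allelic dropout problem described above.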
One could argue that instead of sequencing organellar loci to help confirm taxonomic identity, low coverage whole genome sequences could be obtained for each individual. Given that sequencing costs will continue to decrease, it is likely that the number of whole genome sequences will continue to rise. This is due, in part, to both advancing sequencing technology and several consortia focused on generating whole genome sequences either for specific taxonomic groups, including vertebrates (Koepfli et al., 2015; Rhie et al., 2021), birds (G. Zhang, 2015), and arthropods (i5K Consortium, 2013), or more broadly for all eukaryotic life on Earth (i.e. the Earth BioGenome Project; Lewin et al., 2018). If the ambitious goals of the Earth BioGenome Project (EBP) come to fruition, high-quality reference genomes will be available for all species of eukaryotes. This will no doubt serve as a tremendous resource to a diverse group of researchers in ecology, evolution, conservation, epidemiology, agriculture, and medicine. Unfortunately, as of June 2021, only 0.2% of the 1.6 million animal species (Z.-Q. Zhang, 2013) had a deposited reference genome (many of relatively low quality), with a strong bias towards chordates (Hotaling et al., 2021). Without an adequate reference database of whole genome sequences and associated metadata, the potential utility of using low coverage whole genomes for taxonomic confirmation is limited. In addition, cost would still be prohibitive when sequencing tens or hundreds of individuals as is common in phylogeographic and population genomic studies. Even though the cost of a vertebrate genome can be <$1000 USD as of 2018 (Lewin et al., 2018), this still adds up to a project cost that few can justify and attain. This is particularly the case for researchers at predominantly undergraduate institutions with limited access to funding, and those at institutions in developing countries that may not have adequate infrastructure (Hotaling et al., 2021). Lower coverage genomes are an option to further reduce cost, at the risk of obtaining unreliable base calls that could complicate taxonomic identification.
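To make the scale of the cost difference concrete, the figures quoted in the text can be put side by side for a single 96-sample plate. The sketch below uses only those figures (~$1000-2000 per plate for one Sanger-sequenced gene, and <$1000 per vertebrate genome, treated here as an upper bound); actual prices vary by provider, coverage, and year, so the numbers are a rough illustration rather than a quote.

```python
SAMPLES_PER_PLATE = 96

# Figures taken from the text; real prices will differ among providers.
SANGER_PLATE_COST_USD = (1000, 2000)   # one organellar gene, whole plate
WGS_COST_PER_GENOME_USD = 1000         # "<$1000", used as an upper bound


def per_sample(cost_range, n_samples):
    """Convert a (low, high) plate cost into a (low, high) per-sample cost."""
    return tuple(round(cost / n_samples, 2) for cost in cost_range)


sanger_per_sample = per_sample(SANGER_PLATE_COST_USD, SAMPLES_PER_PLATE)
wgs_plate_upper = WGS_COST_PER_GENOME_USD * SAMPLES_PER_PLATE

# How many times more expensive the WGS upper bound is than the Sanger range.
fold_vs_sanger = tuple(
    wgs_plate_upper / cost for cost in reversed(SANGER_PLATE_COST_USD)
)

print(f"Sanger, per sample: ${sanger_per_sample[0]}-{sanger_per_sample[1]}")
print(f"WGS, per plate (upper bound): ${wgs_plate_upper}")
print(f"WGS / Sanger: {fold_vs_sanger[0]:.0f}x-{fold_vs_sanger[1]:.0f}x")
```

Under these assumptions, a single organellar gene costs roughly $10-21 per sample, whereas sequencing a genome for every sample on the plate could cost up to 48 to 96 times as much.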
Organellar markers (for plants and animals) and a handful of nuclear genes (for fungi) have been sequenced for decades and still serve a valuable purpose for researchers in ecology, evolution, and systematics. These markers can also have tangible conservation implications and help control the spread and resale of threatened and endangered species (Hobbs et al., 2019). The genomic revolution has fundamentally changed the way that science is conducted and hypotheses are generated. However, we should not turn our backs on the extraordinary availability of high-quality mtDNA and cpDNA sequences that are maintained in accessible repositories. These publicly available sequences are a vital resource that can be used to help confirm the taxonomic identity of any sample sequenced using NGS technology. Sequencing and analyzing markers such as mtDNA can also provide additional information to help illuminate species limits, divergence times, dispersal patterns, introgression, and selection. Until high-quality reference genomes become available for more of Earth’s biodiversity, and the cost of whole genome sequencing decreases to the point where it becomes feasible for a broader proportion of researchers to re-sequence a large number of individuals, sequencing and analyzing mtDNA and cpDNA should remain a cost-efficient and relatively simple way to ensure that newly generated genomic data are from the taxon of interest.
Acknowledgments
CB would like to thank Erika Crispo, Nolan Kane, Loren Rieseberg, and two anonymous reviewers for helpful comments that improved the manuscript, and NSF for their support (DEB-1929679).
Data Accessibility
There are no new data to report in this study.
References
Allio, R., Schomaker-Bastos, A., Romiguier, J., Prosdocimi, F., Nabholz, B., & Delsuc, F. (2020). MitoFinder: Efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics. Molecular Ecology Resources, 20(4), 892–905. https://doi.org/10.1111/1755-0998.13160
Amaral, F. R. do, Neves, L. G., Jr, M. F. R. R., Mobili, F., Miyaki, C. Y., Pellegrino, K. C. M., & Biondo, C. (2015). Ultraconserved Elements Sequencing as a Low-Cost Source of Complete Mitochondrial Genomes and Microsatellite Markers in Non-Model Amniotes. PLOS ONE, 10(9), e0138446. https://doi.org/10.1371/journal.pone.0138446
Andrews, K. R., Good, J. M., Miller, M. R., Luikart, G., & Hohenlohe, P. A. (2016). Harnessing the power of RADseq for ecological and evolutionary genomics. Nature Reviews Genetics, 17(2), Article 2. https://doi.org/10.1038/nrg.2015.28
Arevalo, E., Davis, S. K., & Sites, J. W. (1994). Mitochondrial DNA Sequence Divergence and Phylogenetic Relationships among Eight Chromosome Races of the Sceloporus grammicus Complex (Phrynosomatidae) in Central Mexico. Systematic Biology, 43(3), 387–418. https://doi.org/10.2307/2413675
Ashelford, K. E., Chuzhanova, N. A., Fry, J. C., Jones, A. J., & Weightman, A. J. (2005). At least 1 in 20 16S rRNA sequence records currently held in public repositories is estimated to contain substantial anomalies. Applied and Environmental Microbiology, 71(12), 7724–7736. https://doi.org/10.1128/AEM.71.12.7724-7736.2005
Avise, J. C. (2000). Phylogeography: The History and Formation of Species. Harvard University Press.
Avise, J. C., Arnold, J., Ball, R. M., Bermingham, E., Lamb, T., Neigel, J. E., Reeb, C. A., & Saunders, N. C. (1987). Intraspecific Phylogeography: The mitochondrial DNA bridge between population genetics and systematics. Annual Review of Ecology and Systematics, 18, 489–522.
Baca, S. M., Alexander, A., Gustafson, G. T., & Short, A. E. Z. (2017). Ultraconserved elements show utility in phylogenetic inference of Adephaga (Coleoptera) and suggest paraphyly of ‘Hydradephaga.’ Systematic Entomology, 42(4), Article 4. https://doi.org/10.1111/syen.12244
Baird, N. A., Etter, P. D., Atwood, T. S., Currey, M. C., Shiver, A. L., Lewis, Z. A., Selker, E. U., Cresko, W. A., & Johnson, E. A. (2008). Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers. PLOS ONE, 3(10), Article 10. https://doi.org/10.1371/journal.pone.0003376
Barrowclough, G. F., & Zink, R. M. (2009). Funds enough, and time: MtDNA, nuDNA and the discovery of divergence. Molecular Ecology, 18(14), Article 14.
Blair. (2009). Molecular phylogenetics and taxonomy of leaf-toed geckos (Phyllodactylidae: Phyllodactylus) inhabiting the peninsula of Baja California. Zootaxa, 2027, 28–42.
Blair, C., & Bryson, R. W. (2017). Cryptic diversity and discordance in single-locus species delimitation methods within horned lizards (Phrynosomatidae: Phrynosoma). Molecular Ecology Resources, 17(6), 1168–1182. https://doi.org/10.1111/1755-0998.12658
Blair, C., Bryson, R. W., Linkem, C. W., Lazcano, D., Klicka, J., & McCormack, J. E. (2019). Cryptic diversity in the Mexican highlands: Thousands of UCE loci help illuminate phylogenetic relationships, species limits and divergence times of montane rattlesnakes (Viperidae: Crotalus). Molecular Ecology Resources, 19(2), Article 2. https://doi.org/10.1111/1755-0998.12970
Boessenkool, S., Star, B., Scofield, R. P., Seddon, P. J., & Waters, J. M. (2010). Lost in translation or deliberate falsification? Genetic analyses reveal erroneous museum data for historic penguin specimens. Proceedings of the Royal Society B: Biological Sciences, 277(1684), 1057–1064. https://doi.org/10.1098/rspb.2009.1837
Bonatelli, I. A. S., Zappi, D. C., Taylor, N. P., & Moraes, E. M. (2013). Usefulness of cpDNA markers for phylogenetic and phylogeographic analyses of closely related cactus species. Genetics and Molecular Research: GMR, 12(4), 4579–4585. https://doi.org/10.4238/2013.February.28.27
Bridge, P. D., Roberts, P. J., Spooner, B. M., & Panchal, G. (2003). On the unreliability of published DNA sequences. The New Phytologist, 160(1), 43–48. https://doi.org/10.1046/j.1469-8137.2003.00861.x
Bryson, R. W., Linkem, C. W., Pavón-Vázquez, C. J., Nieto-Montes de Oca, A., Klicka, J., & McCormack, J. E. (2017). A phylogenomic perspective on the biogeography of skinks in the Plestiodon brevirostris group inferred from target enrichment of ultraconserved elements. Journal of Biogeography, 44(9), Article 9. https://doi.org/10.1111/jbi.12989
Caparroz, R., Rocha, A. V., Cabanne, G. S., Tubaro, P., Aleixo, A., Lemmon, E. M., & Lemmon, A. R. (2018). Mitogenomes of two neotropical bird species and the multiple independent origin of mitochondrial gene orders in Passeriformes. Molecular Biology Reports, 45(3), 279–285. https://doi.org/10.1007/s11033-018-4160-5
Castoe, T. A., Spencer, C. L., & Parkinson, C. L. (2007). Phylogeographic structure and historical demography of the western diamondback rattlesnake (Crotalus atrox): A perspective on North American desert biogeography. Molecular Phylogenetics and Evolution, 42(1), Article 1. https://doi.org/10.1016/j.ympev.2006.07.002
Chafin, T. K., Martin, B. T., Mussmann, S. M., Douglas, M. R., & Douglas, M. E. (2018). FRAGMATIC: In silico locus prediction and its utility in optimizing ddRADseq projects. Conservation Genetics Resources, 10(3), 325–328. https://doi.org/10.1007/s12686-017-0814-1
Çoraman, E., Dundarova, H., Dietz, C., & Mayer, F. (2020). Patterns of mtDNA introgression suggest population replacement in Palaearctic whiskered bat species. Royal Society Open Science, 7(6), 191805. https://doi.org/10.1098/rsos.191805
Crawford, N. G., Faircloth, B. C., McCormack, J. E., Brumfield, R. T., Winker, K., & Glenn, T. C. (2012). More than 1000 ultraconserved elements provide evidence that turtles are the sister group of archosaurs. Biology Letters, 8(5), Article 5.
Dam, M. H. V., Lam, A. W., Sagata, K., Gewa, B., Laufa, R., Balke, M., Faircloth, B. C., & Riedel, A. (2017). Ultraconserved elements (UCEs) resolve the phylogeny of Australasian smurf-weevils. PLOS ONE, 12(11), Article 11. https://doi.org/10.1371/journal.pone.0188044
Davey, J. W., & Blaxter, M. L. (2010). RADSeq: Next-generation population genetics. Briefings in Functional Genomics, 9(5–6), Article 5–6.
Derkarabetian, S., Benavides, L. R., & Giribet, G. (2019). Sequence capture phylogenomics of historical ethanol-preserved museum specimens: Unlocking the rest of the vault. Molecular Ecology Resources, 19(6), 1531–1544. https://doi.org/10.1111/1755-0998.13072
DeSalle, R., & Goldstein, P. (2019). Review and Interpretation of Trends in DNA Barcoding. Frontiers in Ecology and Evolution, 7. https://www.frontiersin.org/articles/10.3389/fevo.2019.00302
Du, Z.-Y., Harris, A. J., & Xiang, Q.-Y. J. (2020). Phylogenomics, co-evolution of ecological niche and morphology, and historical biogeography of buckeyes, horsechestnuts, and their relatives (Hippocastaneae, Sapindaceae) and the value of RAD-Seq for deep evolutionary inferences back to the Late Cretaceous. Molecular Phylogenetics and Evolution, 145, 106726. https://doi.org/10.1016/j.ympev.2019.106726
Eaton, D. A. R., & Overcast, I. (2020). ipyrad: Interactive assembly and analysis of RADseq datasets. Bioinformatics, 36(8), Article 8. https://doi.org/10.1093/bioinformatics/btz966
Eaton, D. A., & Ree, R. H. (2013). Inferring phylogeny and introgression using RADseq data: An example from flowering plants (Pedicularis: Orobanchaceae). Systematic Biology, 62(5), Article 5.
Edwards. (2009). Looking forwards or looking backwards in avian phylogeography? A comment to Zink and Barrowclough 2008. Molecular Ecology, 18, 2930–2933.
Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K., Buckler, E. S., & Mitchell, S. E. (2011). A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLOS ONE, 6(5), Article 5. https://doi.org/10.1371/journal.pone.0019379
Faircloth, B. C., McCormack, J. E., Crawford, N. G., Harvey, M. G., Brumfield, R. T., & Glenn, T. C. (2012). Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Systematic Biology, 61(5), Article 5. https://doi.org/10.1093/sysbio/sys004
Finger, N., Farleigh, K., Bracken, J. T., Leaché, A. D., François, O., Yang, Z., Flouri, T., Charran, T., Jezkova, T., Williams, D. A., & Blair, C. (2022). Genome-Scale Data Reveal Deep Lineage Divergence and a Complex Demographic History in the Texas Horned Lizard (Phrynosoma cornutum) throughout the Southwestern and Central United States. Genome Biology and Evolution, 14(1), evab260. https://doi.org/10.1093/gbe/evab260
Fujisawa, T., & Barraclough, T. G. (2013). Delimiting species using single-locus data and the generalized mixed yule coalescent (GMYC) approach: A revised method and evaluation on simulated datasets. Systematic Biology, syt033.
Gautier, M., Gharbi, K., Cezard, T., Foucaud, J., Kerdelhué, C., Pudlo, P., Cornuet, J.-M., & Estoup, A. (2013). The effect of RAD allele dropout on the estimation of genetic variation within and between populations. Molecular Ecology, 22(11), 3165–3178. https://doi.org/10.1111/mec.12089
Harrington, S. M., Hollingsworth, B. D., Higham, T. E., & Reeder, T. W. (2018). Pleistocene climatic fluctuations drive isolation and secondary contact in the red diamond rattlesnake (Crotalus ruber) in Baja California. Journal of Biogeography, 45(1), Article 1. https://doi.org/10.1111/jbi.13114
Hebert. (2004). Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. PNAS, 101(41), Article 41.
Hebert, P. D., Cywinska, A., & Ball, S. L. (2003). Biological identifications through DNA barcodes. Proceedings of the Royal Society of London. Series B: Biological Sciences, 270(1512), Article 1512.
Hebert, P. D. N., & Gregory, T. R. (2005). The Promise of DNA Barcoding for Taxonomy. Systematic Biology, 54(5), Article 5. https://doi.org/10.1080/10635150500354886
Hickerson, M., Carstens, B., Cavender-Bares, J., Crandall, K., Graham, C., Johnson, J., Rissler, L., Victoriano, P., & Yoder, A. (2010). Phylogeography’s past, present, and future: 10 years after. Molecular Phylogenetics and Evolution, 54(1), Article 1.
Hickerson, M. J., Meyer, C. P., & Moritz, C. (2006). DNA barcoding will often fail to discover new animal species over broad parameter space. Systematic Biology, 55(5), 729–739. https://doi.org/10.1080/10635150600969898
Hobbs, C. A. D., Potts, R. W. A., Bjerregaard Walsh, M., Usher, J., & Griffiths, A. M. (2019). Using DNA Barcoding to Investigate Patterns of Species Utilisation in UK Shark Products Reveals Threatened Species on Sale. Scientific Reports, 9(1), Article 1. https://doi.org/10.1038/s41598-018-38270-3
Honeycutt, R. L. (2021). Editorial: DNA Barcodes: Controversies, Mechanisms, and Future Applications. Frontiers in Ecology and Evolution, 9. https://www.frontiersin.org/articles/10.3389/fevo.2021.718865
Hotaling, S., Kelley, J. L., & Frandsen, P. B. (2021). Toward a genome sequence for every animal: Where are we now? Proceedings of the National Academy of Sciences, 118(52), e2109019118. https://doi.org/10.1073/pnas.2109019118
Hou, Y., Nowak, M. D., Mirré, V., Bjorå, C. S., Brochmann, C., & Popp, M. (2015). Thousands of RAD-seq Loci Fully Resolve the Phylogeny of the Highly Disjunct Arctic-Alpine Genus Diapensia (Diapensiaceae). PLOS ONE, 10(10), e0140175. https://doi.org/10.1371/journal.pone.0140175
i5K Consortium. (2013). The i5K Initiative: Advancing arthropod genomics for knowledge, human health, agriculture, and the environment. The Journal of Heredity, 104(5), 595–600. https://doi.org/10.1093/jhered/est050
Kapli, P., Lutteropp, S., Zhang, J., Kobert, K., Pavlidis, P., Stamatakis, A., & Flouri, T. (2017). Multi-rate Poisson tree processes for single-locus species delimitation under maximum likelihood and Markov chain Monte Carlo. Bioinformatics (Oxford, England), 33(11), Article 11. https://doi.org/10.1093/bioinformatics/btx025
Kato, D., Suzuki, H., Tsuruta, A., Maeda, J., Hayashi, Y., Arima, K., Ito, Y., & Nagano, Y. (2020). Evaluation of the population structure and phylogeography of the Japanese Genji firefly, Luciola cruciata, at the nuclear DNA level using RAD-Seq analysis. Scientific Reports, 10, 1533. https://doi.org/10.1038/s41598-020-58324-9
Koepfli, K.-P., Paten, B., Genome 10K Community of Scientists, & O’Brien, S. J. (2015). The Genome 10K Project: A way forward. Annual Review of Animal Biosciences, 3, 57–111. https://doi.org/10.1146/annurev-animal-090414-014900
Laczkó, L., Jordán, S., & Sramkó, G. (2022). The RadOrgMiner pipeline: Automated genotyping of organellar loci from RADseq data. Methods in Ecology and Evolution, 13(9), 1962–1975. https://doi.org/10.1111/2041-210X.13937
Leaché, A. D., & Cole, C. J. (2007). Hybridization between multiple fence lizard lineages in an ecotone: Locally discordant variation in mitochondrial DNA, chromosomes, and morphology. Molecular Ecology , 16 (5), 1035–1054. https://doi.org/10.1111/j.1365-294X.2006.03194.x
Leaché, A. D., & McGuire, J. A. (2006). Phylogenetic relationships of horned lizards (Phrynosoma) based on nuclear and mitochondrial data: Evidence for a misleading mitochondrial gene tree. Molecular Phylogenetics and Evolution , 39 (3), Article 3. https://doi.org/10.1016/j.ympev.2005.12.016
Lemmon. (2012). Anchored hybrid enrichment for massively high-throughput phylogenomics. Systematic Biology, 1–18.
Lewin, H. A., Robinson, G. E., Kress, W. J., Baker, W. J., Coddington, J., Crandall, K. A., Durbin, R., Edwards, S. V., Forest, F., Gilbert, M. T. P., Goldstein, M. M., Grigoriev, I. V., Hackett, K. J., Haussler, D., Jarvis, E. D., Johnson, W. E., Patrinos, A., Richards, S., Castilla-Rubio, J. C., … Zhang, G. (2018). Earth BioGenome Project: Sequencing life for the future of life. Proceedings of the National Academy of Sciences of the United States of America, 115(17), 4325–4333. https://doi.org/10.1073/pnas.1720115115
Lindell, J., Méndez-de la Cruz, F. R., & Murphy, R. W. (2005). Deep genealogical history without population differentiation: Discordance between mtDNA and allozyme divergence in the zebra-tailed lizard (Callisaurus draconoides). Molecular Phylogenetics and Evolution, 36(3), Article 3. https://doi.org/10.1016/j.ympev.2005.04.031
Lindell, J., Méndez-de la Cruz, F. R., & Murphy, R. W. (2008). Deep biogeographical history and cytonuclear discordance in the black-tailed brush lizard (Urosaurus nigricaudus) of Baja California. Biological Journal of the Linnean Society, 94(1), Article 1. https://doi.org/10.1111/j.1095-8312.2008.00976.x
Lindell, J., Ngo, A., & Murphy, R. W. (2006). Deep genealogies and the mid-peninsular seaway of Baja California. Journal of Biogeography, 33(8), Article 8. https://doi.org/10.1111/j.1365-2699.2006.01532.x
Lyra, M. L., Joger, U., Schulte, U., Slimani, T., El Mouden, E. H., Bouazza, A., Künzel, S., Lemmon, A. R., Lemmon, E. M., & Vences, M. (2017). The mitochondrial genomes of Atlas Geckos (Quedenfeldtia): Mitogenome assembly from transcriptomes and anchored hybrid enrichment datasets. Mitochondrial DNA. Part B, Resources, 2(1), 356–358. https://doi.org/10.1080/23802359.2017.1339212
Mastrantonio, V., Porretta, D., Urbanelli, S., Crasta, G., & Nascetti, G. (2016). Dynamics of mtDNA introgression during species range expansion: Insights from an experimental longitudinal study. Scientific Reports, 6(1), Article 1. https://doi.org/10.1038/srep30355
Meger, J., Ulaszewski, B., Vendramin, G. G., & Burczyk, J. (2019). Using reduced representation libraries sequencing methods to identify cpDNA polymorphisms in European beech (Fagus sylvatica L). Tree Genetics & Genomes, 15(1), 7. https://doi.org/10.1007/s11295-018-1313-6
Meiklejohn, K. A., Damaso, N., & Robertson, J. M. (2019). Assessment of BOLD and GenBank – Their accuracy and reliability for the identification of biological materials. PLOS ONE, 14(6), e0217084. https://doi.org/10.1371/journal.pone.0217084
Melo, A. T. O., & Hale, I. (2019). Expanded functionality, increased accuracy, and enhanced speed in the de novo genotyping-by-sequencing pipeline GBS-SNP-CROP. Bioinformatics (Oxford, England), 35(17), 3215. https://doi.org/10.1093/bioinformatics/bty1073
Miller, C. D., Forthman, M., Miller, C. W., & Kimball, R. T. (2022). Extracting ‘legacy loci’ from an invertebrate sequence capture data set. Zoologica Scripta, 51(1), 14–31. https://doi.org/10.1111/zsc.12513
Moritz, C., & Cicero, C. (2004). DNA Barcoding: Promise and Pitfalls. PLoS Biology, 2(10), e354. https://doi.org/10.1371/journal.pbio.0020354
Mulcahy, D. G., Ibáñez, R., Jaramillo, C. A., Crawford, A. J., Ray, J. M., Gotte, S. W., Jacobs, J. F., Wynn, A. H., Gonzalez-Porter, G. P., McDiarmid, R. W., Crombie, R. I., Zug, G. R., & Queiroz, K. de. (2022). DNA barcoding of the National Museum of Natural History reptile tissue holdings raises concerns about the use of natural history collections and the responsibilities of scientists in the molecular age. PLOS ONE, 17(3), e0264930. https://doi.org/10.1371/journal.pone.0264930
Myers, E. A., Hickerson, M. J., & Burbrink, F. T. (2017). Asynchronous diversification of snakes in the North American warm deserts. Journal of Biogeography, 44(2), Article 2. https://doi.org/10.1111/jbi.12873
Peterson, B. K., Weber, J. N., Kay, E. H., Fisher, H. S., & Hoekstra, H. E. (2012). Double digest RADseq: An inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS ONE, 7(5), Article 5.
Ratnasingham, S., & Hebert, P. D. N. (2007). BOLD: The Barcode of Life Data System (http://www.barcodinglife.org). Molecular Ecology Notes, 7(3), 355–364. https://doi.org/10.1111/j.1471-8286.2007.01678.x
Rhie, A., McCarthy, S. A., Fedrigo, O., Damas, J., Formenti, G., Koren, S., Uliano-Silva, M., Chow, W., Fungtammasan, A., Kim, J., Lee, C., Ko, B. J., Chaisson, M., Gedman, G. L., Cantin, L. J., Thibaud-Nissen, F., Haggerty, L., Bista, I., Smith, M., … Jarvis, E. D. (2021). Towards complete and error-free genome assemblies of all vertebrate species. Nature, 592(7856), Article 7856. https://doi.org/10.1038/s41586-021-03451-0
Rivera-Colón, A. G., Rochette, N. C., & Catchen, J. M. (2021). Simulation with RADinitio improves RADseq experimental design and sheds light on sources of missing data. Molecular Ecology Resources, 21(2), 363–378. https://doi.org/10.1111/1755-0998.13163
Rochette, N. C., Rivera-Colón, A. G., & Catchen, J. M. (2019). Stacks 2: Analytical methods for paired-end sequencing improve RADseq-based population genomics. Molecular Ecology, 28(21), 4737–4754. https://doi.org/10.1111/mec.15253
Schield, D. R., Card, D. C., Adams, R. H., Jezkova, T., Reyes-Velasco, J., Proctor, F. N., Spencer, C. L., Herrmann, H.-W., Mackessy, S. P., & Castoe, T. A. (2015). Incipient speciation with biased gene flow between two lineages of the Western Diamondback Rattlesnake (Crotalus atrox). Molecular Phylogenetics and Evolution, 83, 213–223. https://doi.org/10.1016/j.ympev.2014.12.006
Scott, B., Baker, E., Woodburn, M., Vincent, S., Hardy, H., & Smith, V. S. (2019). The Natural History Museum Data Portal. Database, 2019, baz038. https://doi.org/10.1093/database/baz038
Simon, C., Gordon, E. R. L., Moulds, M. S., Cole, J. A., Haji, D., Lemmon, A. R., Lemmon, E. M., Kortyna, M., Nazario, K., Wade, E. J., Meister, R. C., Goemans, G., Chiswell, S. M., Pessacq, P., Veloso, C., McCutcheon, J. P., & Łukasik, P. (2019). Off-target capture data, endosymbiont genes and morphology reveal a relict lineage that is sister to all other singing cicadas. Biological Journal of the Linnean Society, 128(4), 865–886. https://doi.org/10.1093/biolinnean/blz120
Soltis, D. E., Gitzendanner, M. A., Strenge, D. D., & Soltis, P. S. (1997). Chloroplast DNA intraspecific phylogeography of plants from the Pacific Northwest of North America. Plant Systematics and Evolution, 206(1), 353–373. https://doi.org/10.1007/BF00987957
Stobie, C. S., Cunningham, M. J., Oosthuizen, C. J., & Bloomer, P. (2019). Finding stories in noise: Mitochondrial portraits from RAD data. Molecular Ecology Resources, 19(1), 191–205. https://doi.org/10.1111/1755-0998.12953
Talavera, G., Dincă, V., & Vila, R. (2013). Factors affecting species delimitations with the GMYC model: Insights from a butterfly survey. Methods in Ecology and Evolution, 4(12), Article 12. https://doi.org/10.1111/2041-210X.12107
Upton, D. E., & Murphy, R. W. (1997). Phylogeny of the side-blotched lizards (Phrynosomatidae: Uta) based on mtDNA sequences: Support for midpeninsular seaway in Baja California. Molecular Phylogenetics and Evolution, 8(1), Article 1.
Villaverde, T., Maguilla, E., Luceño, M., & Hipp, A. L. (2021). Assessing the sensitivity of divergence time estimates to locus sampling, calibration points, and model priors in a RAD-seq phylogeny of Carex section Schoenoxiphium. Journal of Systematics and Evolution, 59(4), 687–697. https://doi.org/10.1111/jse.12724
Vitelli, M., Vessella, F., Cardoni, S., Pollegioni, P., Denk, T., Grimm, G. W., & Simeone, M. C. (2016). Phylogeographic structuring of plastome diversity in Mediterranean oaks (Quercus Group Ilex, Fagaceae). Tree Genetics & Genomes, 13(1), 3. https://doi.org/10.1007/s11295-016-1086-8
Wagner, C. E., Keller, I., Wittwer, S., Selz, O. M., Mwaiko, S., Greuter, L., Sivasundar, A., & Seehausen, O. (2013). Genome-wide RAD sequence data provide unprecedented resolution of species boundaries and relationships in the Lake Victoria cichlid adaptive radiation. Molecular Ecology, 22(3), Article 3.
Will, K. W., Mishler, B. D., & Wheeler, Q. D. (2005). The Perils of DNA Barcoding and the Need for Integrative Taxonomy. Systematic Biology, 54(5), 844–851. https://doi.org/10.1080/10635150500354878
Yan, M., Xiong, Y., Liu, R., Deng, M., & Song, J. (2018). The Application and Limitation of Universal Chloroplast Markers in Discriminating East Asian Evergreen Oaks. Frontiers in Plant Science, 9. https://www.frontiersin.org/articles/10.3389/fpls.2018.00569
Zarza, E., Connors, E. M., Maley, J. M., Tsai, W. L. E., Heimes, P., Kaplan, M., & McCormack, J. E. (2018). Combining ultraconserved elements and mtDNA data to uncover lineage diversity in a Mexican highland frog (Sarcohyla; Hylidae). PeerJ, 6, e6045. https://doi.org/10.7717/peerj.6045
Zhang, G. (2015). Bird sequencing project takes off. Nature, 522(7554), Article 7554. https://doi.org/10.1038/522034d
Zhang, Z.-Q. (2013). Animal biodiversity: An outline of higher-level classification and survey of taxonomic richness (Addenda 2013). Zootaxa, 3703, 1–82. https://doi.org/10.11646/zootaxa.3703.1.1
Zink, R. M., & Barrowclough, G. F. (2008). Mitochondrial DNA under siege in avian phylogeography. Molecular Ecology, 17(9), 2107–2121. https://doi.org/10.1111/j.1365-294X.2008.03737.x