Introduction
Species delimitation has never been a simple or straightforward task,
due, in part, to how species are conceptualized (Mayden, 1997; Wilkins,
2011). However, conceptualizing species as “separately evolving
metapopulation lineages” – the unified species concept (USC) – shifts
the debate away from species concepts to how species boundaries can be
empirically delimited (de Queiroz, 2007). Under the USC, rather than
weighing which property, such as morphology, monophyly, or reproductive
isolation is indicative of a population representing an independent
lineage, these properties simply become lines of evidence that can be
used within an empirical framework for delimiting species (Sites and
Marshall, 2004). Different properties may arise at different times or in
different orders during the process of speciation (de Queiroz, 1998; de
Queiroz, 2005; de Queiroz, 2007). Thus, early in the process of
speciation, there is often a grey area where species boundaries are
difficult to ascertain and potential for conflicting signals among
independent lines of evidence (Carstens et al., 2013; Huang and Knowles,
2015; Mallet, 2008). Hence, one of the biggest challenges of species
delimitation is in determining when differences in intra-specific
population structure becomes species-level divergence.
With closely related or recently diverged species, delimitation of
species boundaries using traditional phenotype-based approaches, e.g.,
morphology, ecological, or chemical characteristics, becomes challenging
(Poelstra et al., 2020; Printzen, 2010; Wagner et al., 2013). The
inclusion of molecular sequence data can be a critical tool for
inferring robust species boundaries, especially within an integrative
framework (Dayrat, 2005; Fujita et al., 2012; Will et al., 2005).
Particularly, high-throughput sequencing technologies allow for
genome-scale data to be generated using a wide range of time- and
cost-effective strategies, generating hundreds to thousands of loci
(Hale et al., 2020; McKain et al., 2018). However, even when using
genome-scale data and empirical species delimitation analyses, the
species boundary can remain blurred, especially in recently diverged
species, hybrid zones, cases of introgression, and when species
boundaries remain semipermeable (Harrison and Larson, 2014; Hey et al.,
2003; Leaché et al., 2018). Furthermore, harnessing genome-scale data
creates computational challenges and divergent approaches for selecting
appropriate genomic loci (Faircloth et al., 2012; Kapli et al., 2020).
While additional loci allows for the detection of subtle differences in
population structure, potentially resolving recent species-level
divergence (Huang, 2020), these data may also lead to over-splitting
populations and inappropriately recognizing populations as species
(Rannala, 2015; Sukumaran and Knowles, 2017).
Similar to other non-model organismal groups, interpreting species-level
diversity in lichen-forming fungi has been revolutionized by
incorporating DNA sequence data (Lumbsch and Leavitt, 2011; Printzen,
2010; Schneider et al., 2016). Interpreting anatomical, chemical, and
morphological characters of lichens within a phylogenetic framework has
transformed our understanding of the role of phenotypic characters used
for lichen-forming fungal taxonomy (Printzen, 2010). While portions of
the ribosomal cistron, including the internal transcribed spacer (Schoch
et al., 2012) – the standard barcoding marker for fungi, and regions of
the mitochondrial genome, have been the standard genetic loci for lichen
fungi systematics, genomic data have become increasingly useful when
studying closely related species or recently diverged populations (Allen
et al., 2018; Alonso-García et al., 2021; Grewe et al., 2017; Leavitt et
al., 2016; Widhelm et al., 2021).
Here we investigate species boundaries in a lineage of fruticose
lichen-forming fungi endemic to coastal fog deserts along the Pacific
coasts of the New World – Niebla Rundel & Bowler in the family
Ramalinaceae. Members of this genus form conspicuous, fruticose lichens,
expressing an impressive range of morphological and chemical diversity
(Fig. 1). This phenotypic variation has resulted in two widely different
interpretations of species-level diversity, either a few,
morphologically polymorphic species (Bowler and Marsh, 2004) or a much
finer subdivision, two genera, Niebla with 42 species and one
variety, and Vermilacinia Spjut & Hale with 28 species (Spjut,
1996). These genera were recently investigated within a multi-locus,
phylogenetic framework (Spjut et al., 2020). Niebla diversified
more recently, estimated at ca. 13 Mya, with speciation largely impacted
by the climatic oscillations since the Miocene (Spjut et al., 2020).
While evidence from Spjut et al. (2020) support high levels of diversity
in Niebla , most phenotypically circumscribed Nieblaspecies were not recovered as monophyletic using a multi-locus dataset.
These results have been explained by model of micro-endemism of
allopatric species in Niebla , with convergent forms occurring in
separate populations, especially different geographic regions within the
coastal fog deserts of California, Baja California, and Baja California
Sur (Spjut et al., 2020 – e.g., N. homalea , N. undulata ,
Fig. 7). A – a Furthermore, despite generating a robust, multi-locus
dataset and implementing multiple DNA-based empirical species
delimitation analyses, species boundaries remain uncertain due to
discrepancies among different analytical approaches (Carstens et al.,
2013; Spjut et al., 2020).
Given the incongruence between what has been considered taxonomically
diagnostic phenotypic characters and results from multi-locus
phylogenetic inference and species delimitation analyses, here we aim to
explore the utility of genome-scale data to resolve species boundaries
in this fascinating lineage of lichen-forming fungi. In addition, we
test the hypothesis of micro-endemism in this clade of symbiotic fungi.
Specifically, we used restriction site-associated DNA sequencing
(RADseq) to generate tens of thousands of short loci distributed across
the genome to infer evolutionary
relationships.
Several species delimitation analyses, including BPP and SNAPP+Bayes
factor delimitation (BFD*), were used to infer species boundaries. BPP
is a Bayesian MCMC program for analyzing genomic sequence data under the
multispecies coalescent model and has been shown to be able to delimit
species boundaries even among closely related lineages (Flouri et al.,
2018; Yang and Rannala, 2010, 2014; Zhang et al., 2011). Given that
genomic data be can effective in delimiting fine-scale population
structure, rather than species, we use the recently heuristic
genealogical divergence index (gdi ) to distinguish between
population structure and species boundaries (Jackson et al., 2017;
Leaché et al., 2018). With BPP’s estimates of coalescent node heights
(τ ) and ancestral effective population size (θ ),gdi scores can be calculated to estimate the divergence of two
taxa (Leaché et al., 2018; Poelstra et al., 2020). Complementarily, we
used SNAPP+BFD* to rank different species delimitation models. SNAPP
attempts to decrease computation time for large amounts of phylogenetic
datasets by bypassing the need for gene trees, and species tree
estimates are directly inferred from biallelic markers, such as SNP data
(Bryant et al., 2012). For each model, Bayes factor scores are
calculated based on their marginal likelihood values using a
path-sampling analysis and subsequently compared (Leaché et al., 2014).
While this comparison is useful in identifying the best-fitting species
delimitation models, it computationally limited to only smaller datasets
with fewer than 10 species and under 100 individual samples (Leaché et
al., 2014).
Our results provide crucial insights into the diversification history ofNiebla and highlight the power and limitations of using
genome-scale data to infer species boundaries in recently diverged
lineages.