4.1 Species delimitation in the “messy middle”
Differences in species delimitation studies, ranging from the number and
type of loci selected, how candidate species are initially identified,
e.g., morphology-based vs. sequenced-based, to empirical analytical
approaches, can result in discrepancies in the inferred species
boundaries (Carstens et al., 2013; Luo et al., 2018). Furthermore,
differences in diversification histories likely require different types
of data and different analytical approaches in order to accurately
delimit species boundaries (Camargo and Sites, 2013; Leavitt et al.,
2015). With an increasing number of species inferred from phylogenomic
datasets, the problem of inferring population structure versus actual
species boundaries becomes acute (Leaché et al., 2018). Therefore, how
researchers select the appropriate number of loci and analytical
approaches becomes increasingly important as analyses continue to
increase in scale. Recognizing that any species delimitation approach
infers hypotheses of species boundaries, rather than providing the
ultimate answer/solution, is imperative (Matute and Sepúlveda, 2019).
Hence, integrating independent lines of evidence with phylogenomic
datasets can provide increasingly robust hypotheses of species
boundaries (Fujita et al., 2012; Yang et al., 2019).
The results of our study highlight the problem of delimiting species with complex, recent phylogeographic histories, particularly in groups such as Niebla. Despite the informed perspective from
genome-scale data and long-term investigations of phenotype-based
circumscriptions, species boundaries in Niebla lichens are far
from being resolved. While differences in the number of delimitated
candidate species in Niebla , depending on the analyzed loci and
species delimitation analysis, resulted in no clear favored species
delimitation model, our analyses supported high species-level diversity
in Niebla . Both BPP species delimitation analyses and BFD*
analyses implemented in SNAPP provided strong evidence for multiple
species-level lineages in ASAP species partitions inferred from the
standard fungal barcoding marker, the ITS region (Figs. 2 & 3; Dryad
file S54). The ‘tip-down’ approach using BPP+gdi scores (Jackson
et al., 2017; Leaché et al., 2018), supported the highest number of
candidate species from our dataset, a 29-species model (Fig. 3). The
BFD* analyses also generally favored the most divisive species
delimitation models (Dryad file S54), although independent runs did not
always converge with the more divisive species models. While the BFD*
provides a direct, quantitative method for comparing species models
(Leaché et al., 2014), with increasing numbers of sampled loci and
candidate species, computational limitations may constrain the utility of
this methodological approach (Leaché et al., 2014). In this study, we
attempted to circumvent these limitations by subdividing the topology in
order to analyze computationally feasible data subsets. Despite those
efforts, convergence among independent runs remained an issue in models
with even a modest number of species (Dryad file S54).
Although the heuristic gdi criterion for species delimitation
provides a metric to guide the distinction of population structure from
species boundaries (Jackson et al., 2017; Leaché et al., 2018), we show
here that locus selection can have a substantial impact on gdi scores. In the empirical scenario of Niebla, selecting limited,
but highly variable, RADseq loci resulted in the lowest gdi scores, specifically in the 10- and 163-marker datasets (Fig. 5). In
contrast, randomly selected RADseq loci, resulted in largely consistent,
higher gdi scores; and these gdi scores were more
consistent with the 500-most variable RADseq loci dataset. These results
suggest that when using short RADseq loci, it is important to maximize
the number of analyzed loci if possible, but that lower number of
randomly sampled loci can result in consistent inferences. The heuristic gdi criterion may also be biased in cases where a population was
established by a few founder individuals and NAand θA may be very small. Because the criterion
depends on the population divergence time relative to population size
(2τ /θA ), the use of gdi may lead
to claims of species status even if the populations are recently
diverged (Leaché et al., 2018). We speculate that this may be the case
for Niebla populations in Baja California. Here we inferred
recent divergence among Niebla lineages, and it would not be
unreasonable to suspect that independent populations were founded by few
individuals. If this is the case, it may be necessary to consider both
the absolute population divergence, as well as the population divergence
relative to population size (Leaché et al., 2018; Yang and Rannala,
2010). Finally, given the large range of indecision using the gdi, it may not be possible to infer with confidence species
boundaries given the nature of the speciation process in these
lichen-forming fungi.