Discussion
Diatoms are key organisms for understanding aquatic ecosystem
functioning because of their important role as primary producers (Rimet et
al., 2018). Unfortunately, our knowledge of diatom biodiversity is still
limited given the great number of estimated extant species (Mann and
Vanormelingen, 2013). However, DNA metabarcoding provides a powerful
tool to examine unknown diatom diversity and expand our knowledge about
their distribution patterns.
Contrary to our expectations (Hypothesis 1), we observed a
relatively poor correspondence between the morphology-based and
molecular-based approaches. We found, however, that both methods
provided similar information about the underlying processes
determining geographical variation in diatom communities, thereby
supporting our second hypothesis (Hypothesis 2). There are
several potential explanations for the relatively low congruence found
between the two approaches:
- Choice of DNA marker. The rbcL gene is a common marker used for
metabarcoding and phylogenetic studies (Keck, Vasselon, Rimet,
Bouchez, & Kahlert, 2018). This gene codes for a protein, so
alignment is simple, insertions or deletions are extremely rare and,
compared with ribosomal or mitochondrial markers, the likelihood of
amplifying non-specific products is reduced (Soltis and Soltis, 1998;
Evans, Wortley, & Mann, 2007). Moreover, the rbcL gene seems to
separate taxa better than the 18S rDNA gene at species level (Kermarrec
et al., 2014). However, the rbcL marker does not work for species
lacking a functional plastid (obligate heterotrophs) such as
Nitzschia alba (Kowalska et al., 2019). In addition, the short
312-bp rbcL barcode is readily amplified by PCR,
which makes the analysis easier. However, the use of a short
sequence (<500 bp) for barcoding may constrain taxonomic
and phylogenetic assignment, as the information content of the
sequence is limited (Medlin, 2018; Tedersoo, Tooming-Klunderud &
Anslan, 2018). In this regard, Keck et al. (2018) compared the
placement accuracy of the 312-bp gene fragment with that of the
full-length rbcL gene in their phylogeny, and observed that
approximately 45% of the species were placed exactly as with the
full-length gene. Similarly, our phylogeny constructed with 708
reference sequences and 3138 taxonomy-assigned OTUs showed that several
reference sequences were placed far from their corresponding
taxonomy-assigned OTUs. The correct placement of sequences on a
phylogeny depends on several factors, such as the choice of marker gene,
the length of amplicons and the presence of closely related taxa in the
reference phylogeny (Keck et al., 2018). Tedersoo et al. (2018) have
highlighted the importance of sequence length in metabarcoding,
emphasizing that longer amplicons increase the accuracy of
identification at the species level. We speculate that using a
combination of two or more DNA barcode regions or other marker genes
(e.g. the second internal transcribed spacer) could be more suitable for
unambiguous species identification, especially to distinguish closely
related species (Moniz and Kaczmarska, 2009).
- The PCR reaction used to amplify the barcode region can be inhibited
by contaminants and produce chimeric DNA molecules (Hugerth and
Andersson, 2017). Moreover, several organisms may be underestimated if
their DNA template does not hybridize with the designed primers.
- The completeness of the reference database is a key
factor that strongly limits the taxonomic assignment of OTUs. In this
vein, a large number of morphologically identified diatom taxa (31
species and 8 genera) could not be detected by the metabarcoding approach
due to the lack of reference sequences in the R-Syst::diatom database.
Thus, some species detected only by light microscopy (e.g. Cocconeis
euglypta, C. pediculus, Stauroneis producta and S.
gracilis) differed from those detected by the metabarcoding approach
(C. cupulifera, C. mascarenica, S. anceps and S. gracilior).
Following Jahn, Zetzsche, Reinhardt, & Gemeinholzer (2007), we
further hypothesize that taxa with sequences absent in the reference
database could be compensated by taxa of the same genus that have
sequences available in the reference database or by a taxon not
expected in the studied ponds. This hypothesis could explain the
relatively minor discrepancies observed between both inventories at
genus level resolution.
- The bioinformatics processing might have also played an important role
in the discrepancies observed between morphological and molecular
inventories. Typically, DNA sequences obtained in a high-throughput
sequencing run are filtered and clustered, based on a distance matrix
at a specified threshold, into Operational Taxonomic Units (OTUs) to
reduce the PCR and sequencing errors and the polymorphism present in
the barcode region (Chen, Zhang, Cheng, Zhang, & Zhao, 2013). The
clustering process is mainly affected by the clustering method and the
threshold value used for sequence similarity (Chen et al., 2013).
Often, sequences are clustered at 97% similarity; however, the barcodes
of different taxa may differ by less than this distance (Hugerth and
Andersson, 2017). By contrast, using a high sequence similarity
threshold increases the number of unclassified OTUs and the impact of
PCR and sequencing errors (Tapolczai et al., 2018). Nevertheless, a common
identity threshold for assigning taxonomy to all diatom taxa does not
appear to exist yet, due to the heterogeneous evolutionary rate of the
rbcL gene and the speciation process (Kermarrec et al., 2014).
In addition, the relationship between OTUs and biological species is not
straightforward (Ryberg, 2015; Bálint et al., 2016).
Interestingly, we observed the presence of some marine species in our
molecular inventory, e.g. Thalassiosira profunda and
Thalassiosira mediterranea (Percopo, Siano, Cerino, Sarno, &
Zingone, 2011; Hasle, 1990). The sequences assigned to such species were
placed far from their respective reference sequences in our phylogeny,
which could reflect an inaccurate taxonomic assignment. However, the
taxonomic assignment of such sequences at genus level could be correct,
since thalassiosiroids feature prominently in freshwater ecosystems,
where their diversity rivals that of marine ecosystems (Alverson,
2014). On the other hand, microscopy has a lower capacity to
detect rare species than metabarcoding (Rimet et al., 2018), whereas
the molecular-based approach can recover all species that are
detectable by this method, covering the full range of species richness.
However, we hypothesize that using a higher similarity threshold value
for taxonomic assignment, or using other marker genes simultaneously,
could be more suitable to assign taxonomy unequivocally to such DNA
sequences.
- Cryptic species may be morphologically
identical but genetically distinct (Zimmermann et al., 2015).
Several molecular studies (Mann and Vanormelingen, 2013; An, Choi,
Lee, Lee, & Noh, 2018) have suggested that diatom biodiversity has
been underestimated. For example, in our study we identified
only 12 infrageneric taxa of the genus Nitzschia
morphologically, whereas 24 taxa were detected by the metabarcoding
approach. This could be related to the cryptic
diversity observed within the morphologically identified
Nitzschia palea species complex (Trobajo et al., 2010).
Likewise, genetically distinct entities have been observed within
morphologically identified species in Cyclotella, Eunotia,
Gomphonema, Hantzschia, Navicula, Pinnularia and Sellaphora
(Rovira et al., 2015). On the
other hand, the intraspecific and intragenomic polymorphism present in
the barcode region can lead to overestimation of species richness, since
members of a single taxon may possess several genotypes at the barcode
region and may be clustered into different OTUs (Mora et al., 2019). In
addition, individuals of the same species from different geographic
populations may possess different barcode sequences (Medlin, 2018).
- Other factors, such as the presence of extracellular DNA, can affect the
composition of molecular inventories. Extracellular DNA from
diatom species may be detected in a sample even if their cells are not
physically present, adding extra taxa to the molecular inventory
(Kermarrec et al., 2014; Rimet et al., 2018). Moreover, our
morphological identifications were based on the observation of live
material only. Therefore, some taxa found in our molecular inventories
(e.g. Attheya septentrionalis) may be difficult to identify by
microscopic methods since they are weakly silicified
(Stachura-Suchoples, Enke, Schlie, Schaub, Karsten & Jahn, 2015).
Finally, the high number of synonyms present in diatom taxonomy may
hinder the comparison of morphological and molecular inventories
(Hillebrand, Watermann, Karez & Berninger, 2001).
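The sensitivity of OTU counts to the similarity threshold, raised in the bioinformatics item above, can be illustrated with a minimal sketch. It assumes toy reads that are already aligned to equal length and uses a simple greedy centroid scheme; the function names `identity` and `greedy_cluster` and the example reads are hypothetical and do not represent the pipeline used in this study.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy example: sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: list[str], threshold: float) -> list[list[str]]:
    """Greedy centroid clustering: each read joins the first OTU whose
    centroid (its first member) it matches at >= threshold identity;
    otherwise it founds a new OTU."""
    otus: list[list[str]] = []
    for s in seqs:
        for otu in otus:
            if identity(s, otu[0]) >= threshold:
                otu.append(s)
                break
        else:
            otus.append([s])
    return otus


# Hypothetical 20-bp reads (assumption: invented for illustration only).
reads = [
    "ACGTACGTACGTACGTACGT",  # reference-like read
    "ACGTACGTACGTACGTACGA",  # 1 mismatch to the first read (95% identity)
    "ACGTACGTACGTACGTAGGA",  # 2 mismatches to the first read (90% identity)
    "TTTTACGTACGGGGGTACGT",  # distant read (70% identity)
]

for t in (0.90, 0.95, 0.97):
    print(f"threshold {t:.2f}: {len(greedy_cluster(reads, t))} OTUs")
```

With these toy reads the number of OTUs grows from 2 to 3 to 4 as the threshold rises from 0.90 to 0.97, which mirrors the trade-off described above: a low threshold may merge distinct taxa, while a high threshold splits variants (including error-derived ones) into separate OTUs.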
In spite of all biases inherent to both morphological and metabarcoding
methods, compositional variation of diatom communities was positively
correlated with the environmental template, thereby emphasizing that
diatom communities were mainly controlled by niche-based mechanisms
(e.g. species sorting) and confirming our second hypothesis
(Hypothesis 2). Similar results have been reported by other
studies on diatom communities (Verleyen et al., 2009; Göthe et al.,
2013; Jamoneau, Passy, Soininen, Leboucher, & Tison-Rosebery, 2017), in
which environmental factors dominated over spatial and biological
processes in structuring benthic algal communities. Moreover, in our
study similar environmental variables (e.g. total suspended solids) were
correlated with diatom compositional variation in both inventories, which
could be related to the same substrate having been sampled (S. lacustris),
since diatom species may exhibit a narrow environmental tolerance and
strong preferences for particular substrata (Soininen, 2007; Cantonati
& Spitale, 2009). Host macrophytes are important elements supplying
nutrients to epiphytic diatoms, especially in oligotrophic and
mesotrophic waters (Letáková, Fránková, & Poulíčková, 2018). In our
study, morphological and molecular inventories were related to
nutrients (e.g. total phosphorus and ammonium), which is expected
since nutrients (particularly phosphorus) are important for diatom
primary productivity and growth (Pan, Stevenson, Hill, Herlihy, &
Collins, 1996). Ammonia strongly influences diatom composition and
may be a limiting nutrient for primary productivity (Natarajan, 1970).
Similarly, fluoride can enhance or inhibit the growth of diatoms
depending on its concentration, exposure time and diatom species
(Camargo, 2003). Moreover, conductivity was related to the morphological
inventory at species-level resolution, which is expected since
diatoms are very sensitive to ionic content and composition and,
consequently, are often used to monitor conductivity fluctuations
(Potapova and Charles, 2003). Finally, both morphological and molecular
inventories were related to total suspended solids, which may
influence diatom assemblages through processes such as light attenuation,
nutrient adsorption and algal aggregation (Hoshikawa et al., 2019).
Microorganisms, and particularly diatoms, have historically been
considered to be ubiquitously distributed due to their small size and huge
population densities, and their communities mainly controlled by local
environmental factors (Soininen, 2007; Hillebrand et al., 2001).
Nevertheless, this distribution pattern has been challenged by several
studies (Heino et al., 2010; Soininen, 2007; Blanco, Olenici, & Ortega,
2020), suggesting that variation in community structure cannot be
explained by environmental factors alone, and thereby questioning the
strict ubiquitous dispersal of diatom communities. We found no
significant correlation between compositional variation of diatom
assemblages and spatial distance, which may be explained by the
relatively small extent of our study area. The effect of spatial
distance may be more important at large spatial extents, while
environmental factors may be more important at reduced extents (Alahuhta
& Heino, 2013; Declerck et al., 2011). However, in stochastic and highly
heterogeneous systems such as temporary ponds, environmental control may
not necessarily be strong (Heino et al., 2015). Hence, other factors not
assessed in our study, such as biotic interactions, may also be
important in structuring diatom communities (Göthe et al., 2013).
Nevertheless, we are confident that we included a set of environmental
variables frequently reported to influence the composition of diatom
communities (Pan et al., 1996; Potapova and Charles, 2003). Moreover,
the environmental template we included varied extensively across ponds,
thereby providing potential for species sorting.
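One common way to test whether compositional dissimilarity is associated with spatial distance is a Mantel permutation test. The sketch below is illustrative only: this section does not state which test was applied in the study, the pond coordinates and taxon sets are invented, and the names `pearson` and `mantel` are hypothetical helpers written for this example.

```python
import math
import random
from itertools import combinations


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations: int = 999, seed: int = 0):
    """One-sided Mantel test: correlate the upper triangles of two square
    distance matrices and permute the site order of d2 to build a null
    distribution. Returns (observed r, permutation p-value)."""
    n = len(d1)
    pairs = list(combinations(range(n), 2))
    x = [d1[i][j] for i, j in pairs]
    observed = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        y = [d2[order[i]][order[j]] for i, j in pairs]
        if pearson(x, y) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical data (assumption): five ponds with coordinates and taxon lists.
ponds_xy = [(0.0, 0.0), (1.0, 0.5), (3.0, 1.0), (7.0, 2.0), (12.0, 4.0)]
ponds_taxa = [
    {"a", "b", "c"}, {"a", "b", "d"}, {"b", "d", "e"},
    {"d", "e", "f"}, {"a", "e", "f"},
]

# Euclidean distance between ponds and Jaccard dissimilarity between inventories.
geo = [[math.dist(p, q) for q in ponds_xy] for p in ponds_xy]
bio = [[1 - len(s & t) / len(s | t) for t in ponds_taxa] for s in ponds_taxa]

r, p = mantel(geo, bio)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

With only a handful of sites the number of distinct permutations is small, so the p-value is coarse; real analyses would use many more sites and, typically, an established implementation.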
In summary, our study showed that both molecular and morphological
methods were influenced by several biases inherent in their own
methodologies. The main biases related to the molecular approach were
probably the incompleteness of the reference database and the
bioinformatics processing, which highlights the need to expand the
reference database to include all genotypes of occurring taxa and the
need to reach a consensus on bioinformatics processing in order to
facilitate comparison between studies. In addition, establishing robust species
identification thresholds and using a combination of two or more DNA
barcode regions could be suitable for unambiguous species
identification, especially in those cases where a single marker gene
shows low variability. On the other hand, the limited counting effort
and the presence of cryptic species were
presumably the main biases related to the morphological approach. Our
results showed that both approaches were related to the environmental
template, suggesting that Mediterranean epiphytic diatom communities are
mainly controlled by niche-based mechanisms at regional extents.
However, we did not find a significant correlation between
compositional variation of diatom assemblages and spatial distance,
which is probably explained by the regional spatial extent studied. In
conclusion, our work shows that molecular and morphological
approaches provide complementary information and
highlights the importance of the metabarcoding approach for inferring the
composition of epiphytic diatom assemblages, especially as the
completeness of the reference databases improves and bioinformatics
biases are overcome.