Statistical analyses
Non–metric multidimensional scaling (nMDS) using the Jaccard
coefficient was conducted on both morphological and molecular diatom
data sets (here, we ran separate analyses based on species– and
genus–level resolution data). We used the R package vegan (Oksanen et
al., 2019) to compute the Jaccard dissimilarity matrix and run the nMDS
routines. Since the Jaccard index is sensitive to sample size, and since
metabarcoding data often produce samples of heterogeneous sizes (Ohlmann
et al., 2018), we partitioned the Jaccard dissimilarity matrix into its
true turnover component (Baselga, 2010) with the package betapart
(Baselga et al., 2018). For the sake of comparison, we also ran these
and the following analyses using the spatial turnover component of
β-diversity.
In order to explore the overall degree of correspondence between the
ordinal results obtained for morphological and molecular data sets,
Procrustean rotation analysis and the subsequent PROcrustean
randomization TEST (PROTEST) were applied (Gower, 1971, Jackson, 1995).
PROTEST generates from a permutation–based procedure a
correlation–like statistic referred to as “correlation in a symmetric
Procrustean rotation” or \(m_{1-2}\) and a \(p\) value that measures
the significance of the concordance established by the Procrustean
superimposition approach (Jackson, 1995). In addition, we related the
direction of the movement between the base and the end of the
procrustean arrow and the length of these arrows to the distribution of
the morphology– and molecular–based diatom assemblages to identify the
taxa that contributed most to the dissimilarity between ordinations
(García-Girón, Fernández-Aláez, Fernández-Aláez, & Luis, 2018a;
García-Girón, Fernández-Aláez, & Fernández-Aláez, 2018b).
Each of the resulting dissimilarity matrices (i.e. based on species–
and genus–level resolution data separately, and considering the Jaccard
coefficient and its true turnover component) was used in a number of
statistical tests. First, we used the BIOENV analysis (Clarke and
Ainsworth, 1993) to produce “the best” environmental distance matrix
and identify associations between environmental features and community
composition. In brief, this method is based on standardized
environmental variables and tests all combinations of environmental
features, providing information as to which combination shows the
strongest correlation between compositional variation and environmental
distance matrix. Next, we ran a number of Mantel tests and partial
Mantel tests (Mantel, 1967; Legendre and Legendre, 2012) to examine the
relationships between community dissimilarities and environmental or
spatial distances, i.e. we tested for distance decay in community
similarity along environmental or spatial gradients. The matrices of
spatial distances contained pairwise geographical distances calculated
from latitude and longitude, whereas “the best” subset of standardized
environmental variables from the BIOENV were used to compute the
environmental matrices based on the Euclidean distances between study
sites. Mantel tests and BIOENV analysis were based on Spearman’s rank
correlation coefficient and 999 permutations for obtaining values.
We used Mantel correlograms (Oden and Sokal, 1986) to test if pairwise
dissimilarities were spatially autocorrelated within each distance
class. The distance classes were determined by Sturge’s rule (Legendre
and Legendre, 2012) and values were based on 199 permutation with Holm
correction for multiple testing (Holm, 1979).
Finally, we ran distance–based redundancy analysis (db–RDA) on each
community dissimilarity matrix to examine community–environment
relationships in more detail (Legendre and Anderson, 1999). This method
builds on redundancy analysis, but can be based on any dissimilarity or
distance matrix (Legendre and Legendre, 2012). To retain comparability
among BIOENV, Mantel tests and db–RDA, we used the environmental
variables selected by the BIOENV routine as the predictor variables in
the db–RDAs. Using db–RDA, we tested for the amount of variation
explained (\(R^{2}\)), overall significance of the ordination solutions,
and the marginal significance of each environmental variable included in
the model.
All analyses were performed in R version 3.6.0 (R Development Core Team
2018).