Ecological status separates whole biome and individual domain
compositions
It was not possible to separate the five ecological status groups based
on the community composition of the analysed streams alone (data not
shown).
This was expected as the highly diverse and variable composition of the
biome between samples obscures the relationship between ecological
status and biological diversity. This observation is supported by
previous studies exploring the potential of metabarcoding as an
alternative to conventional bioassessments in freshwater streams
(Kuntke et al., 2020). Another study exploring prediction of
anthropogenic activity in rivers also found that the complete observed
diversity could not explain ecosystem quality, and suggested the use of
indicator organisms to specialise a potential model (Li et al., 2018).
To explore the relationship between ecological status and beta diversity
further, a canonical correspondence analysis (CCA) model was generated
for the whole biome, as well as for the individual domain data (Figure
2). Beta
diversity analysis using CCA is a well-described and widely applied
method in ecological studies (ter Braak & Verdonschot, 1995), which makes
the chosen approach in the present study directly compatible with
existing protocols for data analysis. The CCA model, constrained by
ecological status, revealed that the whole biome data (Figure 2a)
achieved the best separation, followed by the bacterial data (Figure
2b), where near-complete separation of all five ecological status groups
was achieved. Prokaryotic communities associated with sediments and
surface waters have previously been shown to be sensitive to
environmental changes, and have been suggested as a tool for
biomonitoring of pollution (Li et al., 2018; Mlejnková & Sovová, 2010).
The bacterial communities of freshwater streams may present a relatively
unexplored approach with a high potential for the discovery of new
indicators for bioassessment.
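
How much of the compositional variation a CCA constrained by a single
categorical factor can capture may be illustrated with a small sketch.
It relies on a known property of CCA: when the only constraint is one
categorical variable, the constrained inertia equals the inertia of the
count table pooled by group. The function names, the toy counts and the
status labels below are hypothetical and are not taken from the present
dataset; the sketch does not compute ordination scores and is not the
analysis pipeline used in the study.

```python
from collections import defaultdict


def inertia(table):
    """Total inertia of a count table (chi-square statistic / grand total)."""
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    value = 0.0
    for row, rs in zip(table, row_sums):
        for x, cs in zip(row, col_sums):
            expected = rs * cs / total
            if expected > 0:  # skip empty rows or columns
                value += (x - expected) ** 2 / expected
    return value / total


def pool_by_group(table, groups):
    """Sum the rows (samples) of the table within each group label."""
    pooled = defaultdict(lambda: [0.0] * len(table[0]))
    for row, label in zip(table, groups):
        pooled[label] = [a + b for a, b in zip(pooled[label], row)]
    return [pooled[label] for label in sorted(pooled)]


def constrained_fraction(table, groups):
    """Share of total inertia explained by a single categorical constraint."""
    return inertia(pool_by_group(table, groups)) / inertia(table)


# Hypothetical toy data: 6 samples (rows) x 3 taxa (columns).
counts = [
    [30, 5, 1],
    [28, 6, 2],
    [10, 20, 5],
    [12, 18, 6],
    [2, 8, 30],
    [1, 10, 28],
]
status = ["bad", "bad", "moderate", "moderate", "high", "high"]

print(round(constrained_fraction(counts, status), 3))
```

A fraction close to 1 would indicate that most of the variation in
composition is aligned with the grouping, whereas a fraction close to 0
would indicate that the grouping explains little of it.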
A gradient-like overlap between streams of bad and poor, and of moderate
to high, ecological status was observed for the eukaryotic data, which is in
line with previous studies focusing on metabarcoding of invertebrates
(Elbrecht, Vamos, Meissner, Aroviita, & Leese, 2017; Kuntke et al.,
2020), as well as with well-described ecological quality measurement
protocols, which are based on the identification and abundance of chosen
indicator species (Birk et al., 2012). The observed archaeal community
did not separate the samples by ecological quality in a meaningful way.
This is likely related to the low presence of Archaea, a lack of
differentiation from the surrounding environment, and/or the coverage
of the chosen primer set, which might capture only part of the
archaeal taxa. It has previously been shown that archaeal communities in
sediments are highly diverse, as well as sensitive to environmental
change (Hoshino & Inagaki, 2019), and may be worth investigating in
more detail in relation to biomonitoring protocols for freshwater
systems.
The domain-specific diversity analysis could be extended with network
analysis to reveal ecologically meaningful relationships within and
across domains, which could strengthen the detection of indicator
species and organisms associated with individual ecological status
classes. A similar approach has previously been
applied in paddy soils (Wang et al., 2017). Alternatively, indicator
organisms could be extracted from the dataset to reduce the
dimensionality of metabarcoding data and provide a basis for a model
describing the relationship between the biome and ecological status.
This approach has previously been applied in rivers in China (Li et al.,
2018). The data gathered in the present study are promising for this
type of approach. However, due to the low number of samples with bad to
moderate ecological status, the statistical power of a predictive
model based on the current dataset would be relatively low.
Additional sampling to increase the sample size, especially in the
lower quality ecosystems, would strengthen the data and enable the
development of a predictive model based on biome composition data.
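
One way in which indicator organisms could be extracted is sketched
below using the indicator value (IndVal) index of Dufrêne and Legendre,
which multiplies the specificity of a taxon for a group by its fidelity
within that group. This is an assumption made for illustration only: it
is not necessarily the method used by Li et al. (2018), and the function
name, toy counts and labels are hypothetical.

```python
def indicator_values(table, groups):
    """Return {group: [IndVal per taxon]}, each value between 0 and 1.

    specificity = mean abundance in the group / sum of group means
    fidelity    = fraction of samples in the group where the taxon occurs
    """
    levels = sorted(set(groups))
    n_taxa = len(table[0])
    rows_by_group = {
        g: [row for row, label in zip(table, groups) if label == g]
        for g in levels
    }
    mean_abund = {
        g: [sum(r[j] for r in rows) / len(rows) for j in range(n_taxa)]
        for g, rows in rows_by_group.items()
    }
    result = {}
    for g in levels:
        rows = rows_by_group[g]
        values = []
        for j in range(n_taxa):
            total_mean = sum(mean_abund[h][j] for h in levels)
            specificity = mean_abund[g][j] / total_mean if total_mean > 0 else 0.0
            fidelity = sum(1 for r in rows if r[j] > 0) / len(rows)
            values.append(specificity * fidelity)
        result[g] = values
    return result


# Hypothetical toy data: 4 samples (rows) x 3 taxa (columns).
toy_counts = [
    [10, 0, 3],
    [12, 0, 4],
    [0, 5, 3],
    [0, 7, 4],
]
toy_status = ["bad", "bad", "high", "high"]

iv = indicator_values(toy_counts, toy_status)
for group, values in iv.items():
    print(group, [round(v, 2) for v in values])
```

In this toy example, the first taxon occurs only in the samples labelled
"bad" and in all of them, so it would rank as the strongest candidate
indicator for that class, whereas the third taxon is shared equally
between classes and scores lower. With few samples per class, as in the
lower status classes of the present dataset, such values would need to
be treated with caution.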