2.5. Microbial Diversity Analysis
Alpha and beta diversity reflect the richness within a sample and the
difference in bacterial composition among sites, respectively
(Morris et al., 2014). To compute alpha and beta diversity metrics, a
rooted phylogenetic tree was generated; alpha rarefaction was then
performed, along with taxonomic classification of full-length sequences
using SILVA (Quast et al., 2013). To standardize the data, only
operational taxonomic units (OTUs) with seven or more counts in at
least one sample were retained prior to ordination in R using the
Phyloseq package (McMurdie and Holmes, 2013). This filter retained
approximately 75% of the original taxa across all samples, leaving
5,776 OTUs after filtering and standardization.
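
The filtering rule can be illustrated with a minimal Python sketch. It
is not the Phyloseq/R code used in the study; the function name
`filter_otus`, the OTU identifiers, and the toy counts are all
hypothetical. It only shows that the criterion depends on the largest
count in any single sample, not on the total count of an OTU.

```python
def filter_otus(counts, min_count=7):
    """Keep OTUs with at least `min_count` reads in at least one sample.

    `counts` maps an OTU id to its list of per-sample read counts.
    """
    return {otu: row for otu, row in counts.items() if max(row) >= min_count}


# Hypothetical toy table: 4 OTUs across 3 samples (not data from the study).
toy_counts = {
    "OTU_1": [0, 12, 3],  # kept: 12 reads in one sample
    "OTU_2": [6, 6, 6],   # dropped: 18 reads in total, but never 7 in one sample
    "OTU_3": [7, 0, 0],   # kept: exactly 7 reads in one sample
    "OTU_4": [1, 0, 2],   # dropped
}

kept = filter_otus(toy_counts)
retained_fraction = len(kept) / len(toy_counts)
print(sorted(kept), retained_fraction)
```

In this toy table half of the OTUs survive the filter; in the study the
same kind of ratio is what is reported as approximately 75%.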
Principal Coordinates Analysis (PCoA) was used as the ordination
method, as it captured the most total variance in the first two
principal coordinates. Ordination was based mainly on weighted UniFrac,
a distance metric used to compare microbial communities, with
unweighted UniFrac included for comparison
(Lozupone and Knight, 2005). The UniFrac measure takes the
phylogenetic relationship of species into account and is widely used in
microbial ecology (Lozupone et al., 2011). Unweighted UniFrac is a
qualitative measure of diversity that considers only the presence or
absence of taxa and therefore mostly reflects rare taxa, whereas
weighted UniFrac also accounts for the abundance of taxa, making it a
quantitative measure (Lozupone et al., 2007). While all OTUs were used
to calculate diversity metrics and
PCoA, microbial communities were investigated at the phylum level to
find potential linkages between anthropogenic activities and the
abundance of specific bacteria within each community.
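
The contrast between the two UniFrac variants can be made concrete with
a small Python sketch. It is an illustration under stated assumptions,
not the implementation used in the study: the four-tip tree, the sample
names, and the counts are hypothetical, and the weighted variant shown
is the raw (non-normalized) form.

```python
# Hypothetical toy tree ((A:1,B:1):1,(C:1,D:1):1), stored as
# (branch length, set of tips below that branch).
BRANCHES = [
    (1.0, {"A"}), (1.0, {"B"}), (1.0, {"A", "B"}),
    (1.0, {"C"}), (1.0, {"D"}), (1.0, {"C", "D"}),
]


def branch_fraction(sample, tips):
    """Fraction of a sample's reads that descend from a branch."""
    total = sum(sample.values())
    return sum(sample.get(t, 0) for t in tips) / total


def unweighted_unifrac(s1, s2):
    """Branch length found in only one sample / branch length found in either."""
    unique = observed = 0.0
    for length, tips in BRANCHES:
        in1 = branch_fraction(s1, tips) > 0
        in2 = branch_fraction(s2, tips) > 0
        if in1 or in2:
            observed += length
        if in1 != in2:
            unique += length
    return unique / observed


def weighted_unifrac(s1, s2):
    """Raw weighted UniFrac: each branch length is weighted by the
    difference in relative abundance below that branch."""
    return sum(
        length * abs(branch_fraction(s1, tips) - branch_fraction(s2, tips))
        for length, tips in BRANCHES
    )


# Hypothetical samples (read counts per tip).
even = {"A": 50, "B": 50}
skewed = {"A": 90, "B": 10}                 # same taxa, different abundances
rare = {"A": 49, "B": 49, "C": 1, "D": 1}   # two extra taxa at 1% each

for name, other in (("skewed", skewed), ("rare", rare)):
    print(name, unweighted_unifrac(even, other), weighted_unifrac(even, other))
```

Compared with `even`, the `skewed` sample has an unweighted distance of
0 (the same taxa are present) but a weighted distance of about 0.8,
whereas the `rare` sample has an unweighted distance of 0.5 but a
weighted distance of only about 0.08. The unweighted measure responds
to the rare taxa, and the weighted measure responds to the abundance
shift.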
Classic clustering using the unweighted pair group method with
arithmetic mean (UPGMA), single linkage, and Ward’s method, as well as
K-means clustering, was conducted at the phylum level in the PAST
package (Hammer et al., 2001). For K-means, three clusters were chosen
based on preliminary runs and to reflect channel samples, bay samples,
and samples affected by anthropogenic activities. The
significance of the clustering was tested using one-way permutational
multivariate analysis of variance (PERMANOVA). Stations were grouped
according to the output of the clustering analysis, and permutations
(N = 100,000, because of the high number of variables) were used to
assess significance.
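
The logic of a one-way PERMANOVA can be sketched in a few lines of
Python. This is a minimal illustration of the general pseudo-F
permutation test, not the PAST implementation used in the study; the
function names, the six hypothetical stations, their positions, and the
group labels are assumptions made for the example, and the default
number of permutations is far smaller than the 100,000 reported above.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """One-way PERMANOVA pseudo-F from a square distance matrix."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    # Total and within-group sums of squares from pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    a = len(groups)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign stations to groups at random
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical stations placed on a line; distances are absolute differences.
positions = [0, 1, 2, 10, 11, 12]
labels = ["channel"] * 3 + ["bay"] * 3
dist = [[abs(p - q) for q in positions] for p in positions]

f_obs, p_value = permanova(dist, labels)
print(f_obs, p_value)  # pseudo-F is 150.0 for this toy layout
```

With only six stations, few distinct label arrangements exist, so the
p-value cannot become very small however many permutations are drawn.
A larger number of permutations improves the resolution of the p-value
only when the number of stations allows it.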