16S amplicon sequencing of gut microbiomes
All extracted microbiome DNA samples were quantified with a Nanodrop
ND-1000 spectrophotometer (range 4.2-474.9 ng/µl). All samples were then
sent to the QB3 Vincent J. Coates Genomics Sequencing Laboratory at the
University of California, Berkeley for automated library preparation and
sequencing of 16S rRNA amplicons using an Illumina Mi-Seq v3 (600
cycle). As part of the QB3 library preparation, the Forward ITS1 (ITS1f)
– CTTGGTCATTTAGAGGAAGTAA and Reverse ITS1 (ITS2) –
GCTGGGTTCTTCATCGATGC primers (Smith and Peay, 2014) were used for DNA
metabarcoding markers for fungi (Smith and Peay, 2014). QB3 also used
the following 16S rRNA primers for amplification of prokaryotes (archaea
and bacteria): Forward 16S v4 (515Fb) – GTGYCAGCMGCCGCGGTAA, and
Reverse 16S v4 (806Rb) – GGACTACNVGGGTWTCTAAT (Caporaso et al., 2011;
Apprill et al. 2015).
Bioinformatic Analysis /Quantification and Microbial Ecology
Assessment of Samples
All 16S rRNA amplicon sequences were processed through QIIME 2.0 (Bolyen
et al. 2018) to identify microbe species and estimate abundances.
Sequences from all 78 microbiome preps were imported into QIIME v.
2019.10.0. We determined there were no differences between proximal and
distal regions of the gut for the San Salvador Island individuals,
therefore we concatenated the Crescent Pond and Osprey Lake samples into
one file, in which we had 48 samples which included experimental
controls and quality controls from the QB3 facility (Table S2). There
was no difference between the means of microbe counts in the foregut and
the hindgut (paired t-test, P = 0.29).
We used DADA2 (Callahan et al. 2016) for modeling and correcting
Illumina-sequenced amplicon errors, removing chimeras, trimming low
quality bases, and merging of forward and reverse reads using the
following parameters: –p-trunc-len-f 270 –p-trunc-len-r 210. We used
the QIIME alignment mafft software to align sequencesalignment mask to filter non-conserved and highly gapped columns
from the aligned 16S sequences (Stackebrandt and Goodfellow, 1991).
Next, we used qiime phylogeny midpoint-root to root the
phylogeny of our 16S amplicon sequences. Finally, we used qiime
diversity alpha-rarefaction on all samples and we set the
–p-max-depth to 10,000. We removed samples with 5,000 or less from
our analyses.
We compared the beta diversity (qiime emperor plot ) of proximal
and distal gut microbiomes of the San Salvador samples with a two-tailed
paired t -test and found no significant differences between
proximal and distal regions of the gut microbiome (P = 0.29).
Therefore, we merged the proximal and distal samples for each individual
from San Salvador Island, resulting in 48 samples. We also removed oneCualac tessellatus sample because of low read count (129 reads;
Figure S1).
We used the classifier Silva 132 99% 515F/806R
(silva-132-99-515-806-nb-classifier) for training in identification of
taxa from our samples. Afterwards the following files generated in QIIME
were used in R (v. 4.0.0) for further statistical analyses: table.qza,
rooted-tree.qza, taxonomy.qza, and sample-metadata.tsv. We used the
following R packages for further analyses: phyloseq (McMurdie and Holmes
2013) and ggplot2 (Wickham, 2016) with the following functions:distance, plot_bar , plot_ordination , andplot_richness . Before conducting any analyses, we removed the
following taxa from our analyses, uncharacterized and Opisthokonta
(eukaryotic sequences mainly due to fish 16S amplicons). We estimated
alpha diversity by using the plot_richness function and the
Chao1 and Shannon’s diversity indices. For beta diversity, we used theplot_ordination function and non-metric multidimensional scaling
(NMDS) based on Bray-Curtis distances among samples. Hierarchical
clustering was generated with the distance function along withhclust as part of fastcluster (Müllner, 2013) using the average
linkage clustering method. The plot_bar function in the phyloseq
package was used to visualize relative abundance of taxa. In our taxa
plots we removed abundance counts of less than 400 from our analyses. We
used ggplot2 to generate all figures (Wickham, 2016). Lastly, we
used the Linear discriminant analysis Effect Size (LEfSe version 1.0;
Segata et al. 2011) algorithm to identify microbial taxa that were
significantly enriched in each of our specialists (scale-eater and
molluscivore) in comparison to all other samples. This analysis was used
to determine the features (i.e. organisms, clades, operational taxonomic
units) to explain differences in assigned metadata categories. We used
the nonparametric factorial Kruskal-Wallis rank-sum test to detect taxa
with significant differential abundances between specialist samples and
all generalist samples (scale-eaters versus generalists + molluscivores,
molluscivores versus generalists + scale-eaters). We then used a
Wilcoxon test for all pairwise comparisons between taxa within each
significantly enriched class to compare to the class level. From the
standard and gut length measurements, we used ANCOVA in R (v. 4.0.0) to
test whether there was a significant difference among species based on
gut length and standard length.
Lastly, we used generalized linear models (GLMs) in R to test the
effects of diet (generalist, scale-eater, molluscivore), the fixed
effect of location (Osprey Lake, San Salvador Island; Crescent Pond, San
Salvador Island; Lake Cunningham, New Providence Island; Fort Fisher,
North Carolina; and San Luis Potosí, Mexico), and their interaction on
the response variables of principal coordinates axes 1 and 2.