Data analysis
Rarefaction : Statistical analyses were conducted using the
programming language R version 4.1.1
(Team, 2021). Rarefaction
curves from the vegan package
(Oksanen et al., 2018) were
produced to find the number to which to rarefy for sufficient sequencing
depth. For microbial analyses and visualization of results, the phyloseq
and ggplot2 packages were used
(McMurdie & Holmes, 2013);
(Wickham, 2009). For initial
sample analyses, all samples were combined. They were pruned and
rarefied to 4,000 sequences per sample leading to 1,606 samples. For
sample-type specific analyses, all samples were divided into one of
seven sample types, pruned, and rarefied at a sample-type specific value
(Blood: 4,030 sequences, Buccal: 10,961 sequences, Cloaca: 10,329
sequences, Intestines: 9,094 sequences, Gizzard: 22,306 sequences,
Liver: 8,658 sequences, Spleen: 11,701 sequences). Rarefaction curves
are available in the supplementary material (Supplemental Fig. S1).
Taxonomic composition: Taxonomy was assigned using the
Qiime2 naive Bayes feature classifier trained against the Greengenes
13_8 reference (DeSantis et
al., 2006). Phylum level taxonomic information was exported to
Microsoft Excel version Microsoft Office Professional Plus 2016 to
create bar plots and determine phylum percentages; phyla represented at
less than 10% of the sequences in each distinct sample were grouped
into the “other phyla” category.
Alpha diversity: We calculated alpha diversity using the
Observed number of ASVs (richness), the Shannon diversity index
(Shannon & Weaver, 1949),
and the Simpson Diversity Index
(Simpson, 1949); these were
calculated using the phyloseq package
(McMurdie & Holmes, 2013).
The three alpha diversity measurements that will be used are observed
ASVs, the Shannon index
(Shannon & Weaver, 1949),
and the Simpson index
(Simpson, 1949). Observed
ASVs informs how many different types of operational taxonomic units
(OTUs) are within the sample without accounting for abundance. Both the
Shannon index and the Simpson index account for both the quantity of
distinct OTUs and the abundances of those OTUs. However, the Shannon
Index quantifies the amount of randomness within the dataset by
considering the amount of distinct OTUs and how evenly the abundance of
sequences are distributed across them and the Simpson index investigates
the proportion of OTUs in the samples.
Beta diversity: Non-metric Multidimensional Scaling (NMDS)
visualized unweighted UniFrac and weighted UniFrac distances
(Lozupone & Knight, 2005).
Statistical significance was determined using the adonis2 function
(PERMANOVA) within the vegan package
(McArdle & Anderson, 2001);
(Oksanen et al., 2018).
Differential abundance: Plots showing the logarithmic
differences in the quantities of taxa between sample types were created
using the package DESeq2 in R
(Love, Huber, & Anders,
2014). Each sample type pair was rarefied to the lower rarefaction
point used for the sample types separately (i.e. comparisons between
Blood and Buccal samples were rarefied to 4,030 sequences per sample).
An alpha cutoff of 0.01 was used. Venn diagrams were created using
BioVenn (Hulsen, 2021).
Phylogenetic comparative analyses: BirdTree.org was used
to create phylogenetic trees of the bird species represented in each
sample type (Fig. 1) (W.
Jetz, Thomas, Joy, Hartmann, & Mooers, 2012),
(Walter Jetz et al., 2014).
Hackett All Species was used within this tool
(Hackett et al., 2008). To
quantify phylogenetic signal in the microbiome, Pagel’s lambda was
estimated for the three alpha diversity metrics (observed ASVs (after
rarefying), Shannon and Simpson), using phytools in R and the phylogeny
of the species estimated from birdtree.org
(Revell, 2012).
To compare two host traits (bird size and microbiome diversity) while
controlling for phylogeny, Phylogenetic Generalized Least Squares (PGLS)
was performed independently on each of the rarefied sample types.
Average weight for each species was calculated using averaged field
measurements of all birds sampled within each species in the entire
dataset. For species that had no recorded weight, the average species
weight in Birds of the World
(Billerman, Keeney, Rodewald,
& Schulenberg, 2020) was used. Following that, the samples were
divided by body site and the samples were rarefied as described above.
Alpha diversity metrics were calculated for each sample and then
averaged for every species, resulting in one Observed, one Shannon, and
one Simpson value for each species within each body site.
First, each alpha diversity metric was plotted against the average
species weight without controlling for host phylogeny. Pearson’s
correlation coefficient, measuring the linear correlation between two
variables, was used to determine the correlation between the two
variables with a 95% confidence interval
(Freedman, Pisani, & Purves,
2007). Next the same variables were tested against each other while
controlling for phylogeny using a Phylogenetic Generalized Least Squares
analysis in R using the package Geiger
(Pennell et al., 2014).
Because the phylogenetic signal of bird size varies when analyzing size
across orders and within orders
(Harmon et al., 2010), each
sample type was further subset to include only the largest bird order,
Passeriformes. The new subsets were tested using the same methods as
above.