2.4 Data analysis
Read counts from all samples were of the same order of magnitude (18,834 to 44,787 for the bacterial dataset and 9,726 to 62,613 for the fungal dataset). Singletons and doubletons were first filtered out of both the bacterial and fungal datasets. To reduce noise, taxa whose summed relative abundance was less than 0.001 were removed. This resulted in a core dataset of 693 taxa by 40 samples for bacteria and 364 taxa by 40 samples for fungi. The raw taxa counts were then normalized using the Hellinger transformation.
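The filtering and normalization steps can be illustrated with a minimal Python sketch (not the R workflow used in this study). It assumes that singletons and doubletons are taxa whose total count across all samples is 1 or 2, and that the Hellinger transformation is the square root of each count divided by its sample total. The function names and the toy count table are hypothetical.

```python
import math

def drop_rare_taxa(counts, min_total=3):
    """Drop taxa (columns) whose total count across samples is below min_total.

    With min_total=3 this removes singletons and doubletons, under the
    assumption that they are defined on the total count per taxon.
    """
    n_taxa = len(counts[0])
    keep = [j for j in range(n_taxa)
            if sum(row[j] for row in counts) >= min_total]
    return [[row[j] for j in keep] for row in counts]

def hellinger(counts):
    """Hellinger transformation: sqrt(count / sample total), per sample (row)."""
    transformed = []
    for row in counts:
        total = sum(row)
        transformed.append(
            [math.sqrt(c / total) if total > 0 else 0.0 for c in row]
        )
    return transformed

# Toy table: 2 samples (rows) by 4 taxa (columns); values are invented.
samples = [[10, 0, 30, 1],
           [5, 2, 45, 0]]
filtered = drop_rare_taxa(samples)   # taxa with totals 2 and 1 are dropped
transformed = hellinger(filtered)
```

After the transformation, the squared values within each sample sum to 1, which provides a quick sanity check on the output.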
Statistical analyses were performed using R version 3.6.1 (R Development Core Team, 2016). Graphs were plotted with the R packages “ggplot2” (Wickham, 2016), “grid” (Murrell, 2005), and “gridExtra” (Auguie, 2017). Two-way analysis of variance (two-way ANOVA) was carried out to test the effect of plant tissue type or radiation level on the richness and diversity of bacterial and fungal microbiota with function aov in the “stats” package (R Development Core Team, 2016). To control the Type I error rate, p values from the ANOVA models were adjusted with the Benjamini-Hochberg false discovery rate (FDR) procedure using function p.adjust in the “stats” package (Benjamini & Hochberg, 1995; R Development Core Team, 2016; Veach et al., 2019). Significant differences between the microbial populations were further compared using Tukey’s honestly significant difference (HSD) test with function HSD.test in the “agricolae” package (Mendiburu, 2019).
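For readers unfamiliar with the Benjamini-Hochberg adjustment, the following Python sketch shows the step-up calculation: each p value is multiplied by n divided by its ascending rank, and monotonicity is enforced from the largest p value downward. It is an illustration only, not the p.adjust call used in this study; the function name and the example p values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    # Visit p values from largest to smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = n - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented p values from four hypothetical tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
```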
The distance matrices of community composition (Hellinger-transformed OTU read data) of endophytic fungi were constructed by calculating Bray-Curtis dissimilarities (Faith, Minchin, & Belbin, 1987). Non-metric multidimensional scaling (NMDS) was used to visualize the dissimilarity in community composition of endophytic bacteria or fungi among the different plant tissues or radiation levels using the metaMDS function in the “vegan” package (Oksanen et al., 2016). Analysis of similarities (ANOSIM) was applied to test for significant differences in microbial composition between plant tissues or among radiation levels. Permutational multivariate analysis of variance (PerMANOVA) with 999 permutations was implemented with function adonis in the “vegan” package to investigate the environmental influence on microbiota composition.
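The Bray-Curtis dissimilarity between two samples is the sum of absolute differences in abundance divided by the sum of all abundances in both samples. A minimal Python sketch of the pairwise calculation follows; it is not the “vegan” implementation, and the function names and example profiles are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of taxa")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    # Two empty samples are treated as identical; this is a choice of the sketch.
    return numerator / denominator if denominator > 0 else 0.0

def distance_matrix(profiles):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(profiles)
    return [[bray_curtis(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]

# Invented profiles for three samples and three taxa.
profiles = [[0.5, 0.5, 0.0],
            [0.0, 0.5, 0.5],
            [0.5, 0.5, 0.0]]
dist = distance_matrix(profiles)
```

In this toy example the first and third samples are identical (dissimilarity 0), while the first and second share half of their total abundance (dissimilarity 0.5).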
The effect of different environmental factors (explanatory variables) on endophyte abundance or richness (genus level for bacteria and species level for fungi) was tested using Poisson generalized linear models (GLM) with stepwise selection by the Akaike information criterion (AIC). This analysis was performed using function glm in the “stats” package and function stepAIC in the “MASS” package (R Development Core Team, 2016; Veach et al., 2019; Venables & Ripley, 2002). The data distribution was tested with function shapiro.test in the “stats” package. All data were modeled with a Poisson distribution, and overdispersion was tested with function qcc.overdispersion.test in the “qcc” package (Scrucca, 2004). P values were FDR-corrected using the method described above.
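Overdispersion in count data can be gauged with the classical index of dispersion, which compares the sample variance with the mean (the two are equal under a Poisson distribution). The Python sketch below computes the variance-to-mean ratio and the corresponding test statistic, (n - 1) times the ratio, which is referred to a chi-square distribution with n - 1 degrees of freedom. This illustrates the general idea only; whether it reproduces the “qcc” function used here in every detail is an assumption, and the function name and counts are hypothetical.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Return (variance-to-mean ratio, (n - 1) * ratio) for a list of counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("the mean count must be positive")
    ratio = variance(counts) / m          # sample variance, n - 1 denominator
    statistic = ratio * (len(counts) - 1)
    return ratio, statistic

# Invented counts for one taxon across four samples.
ratio, statistic = dispersion_index([2, 4, 6, 8])
```

A ratio well above 1 suggests that the variance exceeds what a Poisson model would predict.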
Co-occurrence analysis was applied to the bacterial and fungal datasets separately and collectively with function cor.test in the “stats” package (R Development Core Team, 2016). The co-occurrence networks were visualized with the “igraph” package (Csardi & Nepusz, 2006). Network characteristics were determined using functions in the “bipartite” package (Dormann, Fründ, Blüthgen, & Gruber, 2009).
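The general idea of a correlation-based co-occurrence network can be sketched in Python as follows: taxa are nodes, and an edge is drawn between two taxa when the correlation of their abundance profiles across samples is strong enough. The correlation type (Pearson), the cutoff of 0.6, the function names, and the toy abundances are all assumptions made for illustration; the text above does not specify the correlation method or the threshold used in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation treated as zero in this sketch
    return sxy / math.sqrt(sxx * syy)

def cooccurrence_edges(abundance, threshold=0.6):
    """List (taxon_a, taxon_b, r) for taxon pairs with |r| >= threshold."""
    names = sorted(abundance)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(abundance[a], abundance[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges

# Invented abundance profiles of four taxa across four samples.
abundance = {
    "taxonA": [1, 2, 3, 4],
    "taxonB": [2, 4, 6, 8],
    "taxonC": [4, 3, 2, 1],
    "taxonD": [1, 3, 1, 3],
}
edges = cooccurrence_edges(abundance)
```

The resulting edge list, with the sign of r distinguishing positive from negative associations, is the kind of input a network package would take for visualization.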
Intra-genus genetic diversity of bacteria and fungi from the control and three treatment levels was evaluated by computing the pairwise distances of DNA sequences within the groups. Only the ASVs assigned taxonomy at the genus level for bacteria and fungi were included in the analysis. Pairwise distances were calculated among all ASVs available in a given genus from one treatment level using the “K80” model with function dist.dna in the “ape” package (Paradis & Schliep, 2018). One-way ANOVA was applied to test for significant differences in intra-genus genetic diversity, as well as in all sequence distances regardless of genus, among the four treatments.
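The K80 (Kimura two-parameter) distance between two aligned sequences is d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. A minimal Python sketch follows. It is not the “ape” implementation: it assumes the sequences are already aligned to equal length and simply skips any site where either sequence has a gap or an ambiguous base, which is a simplification of this sketch. The function name and sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}

def k80_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Invented 10-bp sequences: one transition (A->G) and one transversion (A->C).
d = k80_distance("AAAAAAAAAA", "GAAAAAAAAC")
```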