Bioinformatics
Bioinformatic processing was performed in QIIME 1 (Caporaso et al., 2010) and QIIME 2 (Bolyen et al., 2019). Samples were demultiplexed according to their unique tag/index combinations (Supporting Information Table S1), which were removed during the process together with the primer sequences (Supporting Information Table S2). Only forward reads were used for subsequent analyses, because reverse reads often have lower PHRED quality scores and because length variation within the ITS region often prevents merging of the two reads. Three S. chirindensis samples that contained fewer than 10 000 sequences were removed prior to downstream bioinformatic analyses. The raw sequences from the remaining 183 samples were then passed through Deblur (Amir et al., 2017), as implemented in the QIIME 2 pipeline, which assigns raw sequence reads to Amplicon Sequence Variants (ASVs). Reads were trimmed to 180 bp. The UNITE database (version 8; 020219; https://unite.ut.ee/) provided the reference sequences. Only ASVs classified as belonging to the Kingdom Fungi were retained. Fungal ASVs were written into a feature table, which was used for all downstream analyses. The full ASV feature table and all metadata relating to the manuscript can be obtained from figshare: ASV feature table (10.6084/m9.figshare.14518200) and metadata (10.6084/m9.figshare.14518218).
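The two filtering steps described above (removing low-depth samples and retaining only fungal ASVs) can be illustrated with a minimal Python sketch. This is not the QIIME 2 workflow used in the study: the function name, the dictionary layout and the toy data are hypothetical, and the "k__Fungi" prefix is an assumption about how kingdom-level taxonomy strings are formatted.

```python
MIN_READS = 10_000  # depth threshold stated in the text


def filter_feature_table(counts, taxonomy, min_reads=MIN_READS):
    """Drop low-depth samples, then keep only ASVs classified as Fungi.

    counts:   {sample_id: {asv_id: read_count}}   (hypothetical layout)
    taxonomy: {asv_id: taxonomy_string}           (assumed 'k__Fungi;...' style)
    """
    # Step 1: keep samples with at least `min_reads` reads in total.
    kept = {
        sample: asvs
        for sample, asvs in counts.items()
        if sum(asvs.values()) >= min_reads
    }
    # Step 2: keep ASVs whose kingdom-level assignment is Fungi.
    fungal = {asv for asv, tax in taxonomy.items() if tax.startswith("k__Fungi")}
    return {
        sample: {asv: n for asv, n in asvs.items() if asv in fungal}
        for sample, asvs in kept.items()
    }


# Toy example with invented numbers.
counts = {
    "s1": {"asv1": 9000, "asv2": 500},    # 9 500 reads in total, dropped
    "s2": {"asv1": 8000, "asv2": 4000},   # 12 000 reads in total, kept
}
taxonomy = {"asv1": "k__Fungi;p__Ascomycota", "asv2": "k__Viridiplantae"}
filtered = filter_feature_table(counts, taxonomy)
```

In this toy example only sample "s2" survives the depth filter, and only "asv1" survives the kingdom filter.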
Analyses
All analyses were performed on the full, unrarefied ASV table, for two reasons. First, rarefying the ASV table to the smallest sample size to account for differences in library size between samples made no difference to the interpretation of the results (results not shown). Second, from a statistical point of view, rarefaction is inappropriate for the comparison of relative abundances (McMurdie and Holmes, 2014; Willis, 2019).
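For readers unfamiliar with the procedure, rarefying to the smallest sample size amounts to randomly subsampling every sample, without replacement, down to the read depth of the shallowest sample. The sketch below illustrates that idea in plain Python. It is not the implementation used for the check reported above: the function names, the table layout and the seed are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample one sample's reads without replacement down to `depth`."""
    # Expand the counts into one entry per read, then draw `depth` reads.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(pool, depth)))


def rarefy_table(table, seed=0):
    """Rarefy every sample to the depth of the shallowest sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    depth = min(sum(c.values()) for c in table.values())
    return {sample: rarefy(c, depth, rng) for sample, c in table.items()}


# Toy table with invented counts: s1 has 10 reads, s2 has 40.
table = {"s1": {"a": 5, "b": 5}, "s2": {"a": 30, "b": 10}}
rarefied = rarefy_table(table)
```

After rarefying, every sample has the same total (10 reads here), which is exactly why reads from deeper samples are discarded by this approach.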
Because highly correlated predictor variables can produce spurious effects in analyses, all continuous predictor variables were tested for multicollinearity prior to analyses (Supporting Information Table S3). When two variables were highly correlated, i.e. |r| > 0.75, one of them was removed (Supporting Information Table S3). Bush clump area, bush clump tree basal area and bush clump tree species richness were highly correlated. Therefore, only bush clump tree basal area was retained for analyses of endophyte composition and richness, as it gives a good representation of the density of available woody tree hosts within individual BCs. Bush clump area was retained only in the analyses of successional trends, as BC area was a good proxy for BC maturity and woody vegetation successional stage (Jamison-Daniels et al., 2021).
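The collinearity screen described above can be sketched as a pairwise correlation check against the |r| > 0.75 threshold. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the function names, variable names and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.75):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for a, b in combinations(predictors, 2):
        r = pearson_r(predictors[a], predictors[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


# Invented values for three hypothetical predictors.
predictors = {
    "bc_area": [1, 2, 3, 4, 5],
    "bc_basal_area": [2, 4, 6, 8, 10],
    "elevation": [3, 1, 4, 1, 5],
}
flagged = collinear_pairs(predictors)
```

Each flagged pair would then be reduced to a single variable; which one to keep is a subject-matter decision, as in the choice between bush clump area and bush clump tree basal area above.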