2.3. Statistical analyses
Raw sequence data were analyzed in R 4.0.1 with the
“dada2” package for 16S rRNA gene sequence analysis
(https://benjjneb.github.io/dada2/tutorial.html) (Callahan et al. 2016).
Briefly, adapter and primer sequences were first removed from the raw
sequence data using “cutadapt.” The clean sequences then underwent
trimming and merging. Amplicon sequence variants (ASVs) were derived
after the removal of chimeric sequences and were taxonomically
classified against the Silva database release 138
(Quast et al. 2013; Yilmaz et al. 2014). The ASV table was
subsampled to the minimum sequence count for subsequent
statistical analyses. α-Diversity (Shannon and Chao1
indices) was calculated with the “microeco” and “vegan” packages (Liu
et al. 2021; Oksanen et al. 2022). The α-diversity and community
composition visualizations were produced using Origin 2020 and the
“ggplot2” package in R (Pingram et al. 2019). Non-metric
multidimensional scaling (NMDS) based on Bray-Curtis distances was
performed using the “microeco” package to visualize the similarity between
samples.
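
The subsampling, α-diversity, and Bray-Curtis steps described above can be illustrated with a minimal Python sketch. It is not the “microeco” or “vegan” implementation: the toy ASV table, the function names, the random seed, and the choice of the bias-corrected form of Chao1 are assumptions made for illustration, and the NMDS ordination itself is not reproduced.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Randomly subsample `depth` reads without replacement from one sample."""
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    return dict(Counter(rng.sample(pool, depth)))


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over the observed ASVs."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1 (assumed form): S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for n in counts.values() if n > 0)
    singletons = sum(1 for n in counts.values() if n == 1)
    doubletons = sum(1 for n in counts.values() if n == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 1 - 2 * sum(min counts) / (total_a + total_b)."""
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b))
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


# Hypothetical toy ASV table: sample -> ASV -> read count.
table = {
    "site_A": {"ASV1": 50, "ASV2": 30, "ASV3": 1, "ASV4": 2},
    "site_B": {"ASV1": 10, "ASV2": 40, "ASV5": 10},
}
rng = random.Random(42)  # fixed seed so the subsampling is repeatable
depth = min(sum(c.values()) for c in table.values())  # smallest library size
rarefied = {s: rarefy(c, depth, rng) for s, c in table.items()}

for sample, counts in rarefied.items():
    print(sample, round(shannon(counts), 3), round(chao1(counts), 2))
print("Bray-Curtis:", round(bray_curtis(rarefied["site_A"], rarefied["site_B"]), 3))
```

In this sketch the sample that already sits at the minimum depth is returned unchanged, while deeper samples lose reads at random, so the diversity indices are compared at equal sequencing effort.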
To analyze the community composition of Bathyarchaeia, a
phylogenetic tree was constructed employing reference sequences from a
prior study to classify the Bathyarchaeia subgroup (Zhou et al.
2018). The outgroup sequences belonged to Cenarchaeum
(Cenarchaeum symbiosum) and Nitrosoarchaeum
(Nitrosoarchaeum koreensis). These reference sequences
encompassed 15 Bathyarchaeial subgroups (Zhou et al. 2018). ASVs
affiliated with Bathyarchaeia, as per the Silva 138 database,
were also selected. The phylogenetic tree was constructed
in MEGA11 (Tamura et al. 2021). All sequences were aligned
using ClustalW, and the tree was inferred with the Maximum
Likelihood method, with a bootstrap
analysis (1000 replicates) carried out to evaluate tree topology (Zhou et al.
2018). Based on the tree, the subgroup information of Bathyarchaeial
ASVs was obtained and used for downstream statistical analyses. ArcMap
software was used to predict and visualize the large-scale distribution
pattern of Bathy-6 across paddy soils of eastern China in the form of
predictive atlas maps. The Kriging interpolation method was
used to estimate the relative abundance of Bathy-6 across the
whole map from the site information, namely the geographical
coordinates and the relative abundance of Bathy-6 at each site. The
predictive maps were then obtained using a province mask. The heatmap of
Bathyarchaeial ASVs was constructed using Evolview
(Subramanian et al. 2019).
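
The idea behind Kriging interpolation can be sketched in a few lines of Python. This is a simplified ordinary-kriging example and not the ArcMap procedure: the exponential variogram, its sill and range parameters, the use of plain Euclidean distance on the coordinates, and the four toy sites are all assumptions made for illustration.

```python
import math


def variogram(h, sill=1.0, vrange=2.0):
    """Exponential variogram with zero nugget (an assumed model and parameters)."""
    return sill * (1.0 - math.exp(-h / vrange))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krige(sites, values, target):
    """Ordinary kriging estimate at `target` from the observed (site, value) pairs."""
    n = len(sites)
    # Kriging system: variogram between sites, bordered by the unbiasedness
    # constraint that forces the weights to sum to one.
    a = [[variogram(math.dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(s, target)) for s in sites] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values))


# Hypothetical sampling sites (x, y) and relative abundances of one subgroup.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [0.1, 0.2, 0.3, 0.4]

# Predict on a coarse 3 x 3 grid covering the sampled area.
grid = [[krige(sites, values, (x / 2, y / 2)) for x in range(3)] for y in range(3)]
for row in grid:
    print([round(v, 3) for v in row])
```

With a zero nugget the estimate at a sampled site reproduces the observed value, and unsampled locations receive a weighted average in which nearer sites count more. A real analysis would fit the variogram to the data and account for geographic coordinates.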
To investigate the roles of determinism and stochasticity in shaping archaeal
and Bathyarchaeial community structure, the Sloan neutral community
model (NCM) was used to determine the effect of stochasticity on
community assembly using the “Hmisc”
package (Sloan et al. 2006; Harrell & Dupont 2019). The “spaa”
package was used to evaluate niche width and overlap (Zhang
2016). A Mantel test was
employed to ascertain the connection between environmental factors and
microbial communities using the “linkET” package.
Structural equation modeling (SEM) was employed to quantify the direct
and indirect influences of environmental factors on the shaping of both
the archaeal and Bathyarchaeial communities, utilizing SPSS and AMOS
software. To assess the correlation between
environmental factors and the relative abundance of Bathyarchaeial
subgroups, Pearson’s correlation analysis was conducted with the
“microeco” package. The figures were generated
using Origin 2020.
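
Niche width and niche overlap can be illustrated with a short Python sketch. The text does not state which indices were computed with “spaa”, so Levins' breadth and Pianka's overlap are used here purely as assumed, commonly reported examples, and the abundance vectors are hypothetical.

```python
import math


def levins_breadth(abundances):
    """Levins' niche breadth B = 1 / sum(p_j^2), p_j = share of the taxon in sample j."""
    total = sum(abundances)
    return 1.0 / sum((x / total) ** 2 for x in abundances)


def pianka_overlap(a, b):
    """Pianka's overlap: sum(p_a * p_b) / sqrt(sum(p_a^2) * sum(p_b^2))."""
    pa = [x / sum(a) for x in a]
    pb = [x / sum(b) for x in b]
    num = sum(x * y for x, y in zip(pa, pb))
    return num / math.sqrt(sum(x * x for x in pa) * sum(y * y for y in pb))


# Hypothetical abundances of three taxa across four samples.
generalist = [5, 5, 5, 5]   # evenly spread across samples
specialist = [10, 0, 0, 0]  # confined to a single sample
other = [0, 4, 4, 2]

print(levins_breadth(generalist), levins_breadth(specialist))
print(round(pianka_overlap(generalist, other), 3))
```

Under these definitions, breadth ranges from 1 (taxon found in a single sample) to the number of samples (taxon evenly distributed), and overlap ranges from 0 (no shared samples) to 1 (identical distributions).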
For the co-occurrence network analysis, Spearman’s correlation
coefficients between ASVs were first calculated with the
“microeco” package in R. Only correlations with a
coefficient > 0.7 or < -0.7 and a significance level of
p < 0.01 were retained. Subsequently, the
networks were visualized using Gephi.
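
The edge-filtering step of the co-occurrence analysis can be sketched as follows. This is not the “microeco” implementation: the p-value is obtained here with a simple permutation test as a stand-in for whatever test the package applies, and the toy abundance table, function names, number of permutations, and seed are assumptions.

```python
import random


def ranks(xs):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0  # constant vector: no correlation


def spearman(x, y):
    """Spearman's rho is Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def perm_pvalue(x, y, n_perm, rng):
    """Two-sided permutation p-value for rho (assumed stand-in for the package's test)."""
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def network_edges(table, r_min=0.7, alpha=0.01, n_perm=999, seed=0):
    """Keep ASV pairs with |rho| > r_min and p < alpha as network edges."""
    rng = random.Random(seed)
    names = sorted(table)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = spearman(table[a], table[b])
            if abs(rho) > r_min and perm_pvalue(table[a], table[b], n_perm, rng) < alpha:
                edges.append((a, b, rho))
    return edges


# Hypothetical relative abundances of three ASVs across ten samples.
table = {
    "ASV1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "ASV2": [2, 4, 5, 9, 11, 12, 15, 18, 20, 25],
    "ASV3": [5, 1, 9, 3, 7, 2, 10, 4, 8, 6],
}
edges = network_edges(table)
print(edges)
```

The resulting edge list (source, target, weight) is the kind of table that can be exported and loaded into Gephi for visualization.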