3.5- PCR replication and taxon accumulation.
To assess how many low-abundance and possibly unique taxa occur in
individual PCR replicates, we calculated the increase in α diversity as
PCR replicates were added to a combined data set. Although the order in
which PCR replicates are added does not influence the final cumulative
α diversity, the trajectory toward this endpoint varies with the order.
We therefore bootstrapped the analysis 100 times and plotted the mean.
Notably, after the addition of all 24 PCR replicates, we did not observe
a plateau in species richness, indicating that even this large number of
PCR replicates was insufficient to fully sample the diversity of taxa
within the DNA extract (Figure 5).
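
The accumulation procedure can be illustrated with a minimal sketch. It assumes that each bootstrap iteration re-draws the order of the PCR replicates at random and that each replicate is represented as the set of taxa detected in it; the function name `accumulation_curve`, the fixed seed, and the toy data are hypothetical and are not taken from the analysis itself.

```python
import random

def accumulation_curve(replicates, n_iter=100, seed=0):
    """Mean cumulative richness after adding 1..n replicates in random order.

    replicates: list of sets, one set of detected taxa per PCR replicate.
    Assumption: each iteration uses one random ordering of the replicates.
    """
    rng = random.Random(seed)
    n = len(replicates)
    totals = [0.0] * n
    for _ in range(n_iter):
        order = rng.sample(range(n), n)  # random ordering without repeats
        seen = set()
        for step, idx in enumerate(order):
            seen |= replicates[idx]      # add the taxa of the next replicate
            totals[step] += len(seen)    # cumulative richness at this step
    return [t / n_iter for t in totals]

# Toy data (hypothetical): three replicates sharing one common taxon.
toy = [{"A", "B"}, {"A", "C"}, {"A"}]
curve = accumulation_curve(toy)
print(curve)
```

The last value of the curve is the same for every ordering (here, three taxa), whereas the intermediate values depend on the order and are therefore averaged.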
We next calculated the number of PCR replicates per sample needed to
saturate the taxon accumulation curve, i.e., the point at which adding
another PCR replicate increases richness by fewer than one taxon on
average across our bootstrapped analysis (Table 2). We performed this
analysis at different sequencing read depths and minimum read cutoffs.
The number of PCR replicates needed to reach saturation varied among
sites, read depths, and minimum read cutoffs, although fewer replicates
were needed at higher read cutoffs (Table 2). Surprisingly, increasing
the rarefaction read depth increased the number of replicates required
(Table 2).
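
The saturation criterion can be sketched as follows. This is an illustration under stated assumptions: the minimum read cutoff is assumed to be applied per replicate as `reads >= min_reads`, and the helper names `detected_taxa` and `replicates_to_saturation`, as well as all numbers, are hypothetical.

```python
def detected_taxa(read_counts, min_reads=1):
    """Taxa whose read count in one replicate meets the minimum read cutoff.

    Assumption: the cutoff is applied per replicate as reads >= min_reads.
    """
    return {taxon for taxon, reads in read_counts.items() if reads >= min_reads}

def replicates_to_saturation(curve, threshold=1.0):
    """Smallest number of replicates k at which adding one more replicate
    raises mean richness by less than `threshold`; None if never reached.

    curve[i] is the mean cumulative richness with i + 1 replicates.
    """
    for k in range(1, len(curve)):
        # curve[k - 1]: richness with k replicates; curve[k]: with k + 1
        if curve[k] - curve[k - 1] < threshold:
            return k
    return None

# Hypothetical read counts for a single replicate.
counts = {"A": 120, "B": 3, "C": 1}
print(sorted(detected_taxa(counts, min_reads=2)))  # ['A', 'B']

# Hypothetical mean accumulation curve for five replicates.
mean_richness = [10.0, 14.0, 16.5, 17.2, 17.6]
print(replicates_to_saturation(mean_richness))     # 3
```

A curve that never flattens within the available replicates returns `None`, which corresponds to the case in which no plateau is observed.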
We then plotted histograms of the frequency of taxa detected across
PCR replicates (Figure 6). Most taxa are either singletons (present in
only one PCR replicate) or occur in all PCR replicates. Based on the PITS
data, singletons did not appear to be sequencing artefacts: of the 70
singleton species found, only 11 belonged to the same genus as another
species detected at high frequency (in at least 20 replicates; see
Chlamydomonas, Table S1). To evaluate whether singleton taxa were also
taxa of low relative abundance, we plotted the relationship between a
taxon’s within-replicate sequence abundance and
its frequency across replicates (Figure 7). For all read depths and
minimum read cutoffs, we found a significant positive correlation
(Figure 7), as indicated by a fitted linear model (PITS: p < 2e-16,
t = 24.73, adjusted R² = 0.7324; FITS: p < 2e-16, t = 39.91,
adjusted R² = 0.8219). This indicates that taxa
that occur at low sequence abundance within PCR replicates also occur
less frequently across replicates, and that taxa that are abundant
within PCR replicates are more likely to occur in all PCR replicates. In
the PITS results, a taxon occurs in most replicates only when its
relative abundance exceeds roughly 10% (Figure 7). In the FITS results,
a taxon occurs in most replicates only when its relative abundance
exceeds 1%. Most taxa in soil and sediments were below 1% relative
abundance (Figure 7).
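
The two quantities compared here, the number of replicates in which a taxon occurs and its within-replicate relative abundance, can be computed as in the sketch below. It assumes that each replicate is a mapping from taxon to read count and that abundance is averaged over the replicates in which the taxon is detected; these choices, the function names, and the toy data are hypothetical. The fit is a plain least-squares line that reports unadjusted R², not the adjusted value given in the text.

```python
from collections import Counter

def occupancy(replicates):
    """Number of replicates in which each taxon is detected (reads > 0)."""
    counts = Counter()
    for rep in replicates:
        counts.update(t for t, reads in rep.items() if reads > 0)
    return counts

def mean_relative_abundance(replicates):
    """Mean within-replicate relative abundance of each taxon, averaged
    over the replicates in which it occurs (an assumption of this sketch)."""
    occ = occupancy(replicates)
    sums = Counter()
    for rep in replicates:
        total = sum(rep.values())
        for t, reads in rep.items():
            if reads > 0:
                sums[t] += reads / total
    return {t: sums[t] / occ[t] for t in occ}

def linear_fit(x, y):
    """Ordinary least-squares slope, intercept, and unadjusted R^2 for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

# Toy data (hypothetical): read counts per taxon in three replicates.
reps = [{"A": 90, "B": 10}, {"A": 95, "C": 5}, {"A": 100}]
occ = occupancy(reps)
abund = mean_relative_abundance(reps)
singletons = sorted(t for t, k in occ.items() if k == 1)
taxa = sorted(occ)
slope, intercept, r2 = linear_fit([abund[t] for t in taxa],
                                  [occ[t] for t in taxa])
print(singletons, slope > 0)
```

In the toy data the abundant taxon occurs in every replicate and the two rare taxa are singletons, so the fitted slope is positive, which mirrors the direction of the relationship described above.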