3.1- Data summary
We generated an average of 78,809 PITS (range: 9,352-282,57; Table
S1) and 88,987 FITS (range: 15,409-382,888; Table S2) sequences for each
of our 288 amplicon libraries (24 PCR replicates for each of six
extracts, two primer sets). Following adapter removal and quality
trimming, we retained an average of 37,640 PITS reads (range: 6,148-
166,279; Table S1) and 63,436 FITS reads (range: 12,360-323,310; Table
S2) per PCR replicate. On average, we were unable to assign 2.3% of PITS
and 76.1% of FITS quality-trimmed reads to taxa. The large proportion
of unassignable reads in the FITS data is probably the result of
reference database incompleteness, which is a known problem for this
taxonomic group and one that we attempted to mitigate by increasing the
target depth of sequencing for our FITS libraries.
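
As a rough illustration of the arithmetic behind these summaries, the following Python sketch computes the number of libraries implied by the design (24 replicates, six extracts, two primer sets) and the mean percentage of quality-trimmed reads left unassigned. The function names and any counts passed to them are hypothetical; they are not taken from Tables S1 or S2, and this is not necessarily how the summaries were produced.

```python
from statistics import mean


def expected_library_count(replicates_per_extract, n_extracts, n_primer_sets):
    """Number of amplicon libraries implied by a fully crossed design."""
    return replicates_per_extract * n_extracts * n_primer_sets


def unassigned_fraction(trimmed_reads, assigned_reads):
    """Fraction of quality-trimmed reads with no taxonomic assignment."""
    if trimmed_reads <= 0:
        raise ValueError("trimmed_reads must be positive")
    if not 0 <= assigned_reads <= trimmed_reads:
        raise ValueError("assigned_reads must lie between 0 and trimmed_reads")
    return (trimmed_reads - assigned_reads) / trimmed_reads


def mean_unassigned_percent(libraries):
    """Mean per-library percentage of unassigned reads.

    `libraries` is an iterable of (trimmed_reads, assigned_reads) pairs,
    one pair per PCR replicate (a hypothetical input layout).
    """
    return 100 * mean(unassigned_fraction(t, a) for t, a in libraries)
```

Averaging per-library percentages, as done here, weights every PCR replicate equally regardless of depth; pooling reads across libraries before dividing would instead weight deeper libraries more heavily.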
We found little evidence of contamination introduced during sample
processing, and no evidence of index hopping between libraries during
sequencing. The eight extraction negative control libraries and four
PCR negative control libraries per primer set each contained between 2,684
and 169,767 reads (Tables S1 and S2), but no species were shared between
control and true samples. We found no evidence of contamination in our
PITS data set. The FITS extraction negative control libraries contained
a maximum of 11 reads that matched an “unidentified environmental”
fungus. We removed all reads from the PCR amplicon libraries that were
assigned to this “unidentified environmental” fungus. The PCR negative
control libraries generated using both the PITS and FITS primers
included only primer dimer chimeras. The three spiral ginger libraries
used to track index hopping generated 52,675-299,400 sequences. All
ginger sample sequences assigned to plant taxa aligned to Costus
pulverulentus, and no sequences assigned to barcodes for libraries
other than the ginger sample libraries were assigned to Costus,
indicating there was no index hopping.
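
The two checks described in this paragraph can be sketched in Python as follows, assuming each library is represented as a dictionary mapping taxon names to read counts. The function names, the data layout, and the rule of matching a tracer by the first word of the taxon name are illustrative assumptions, not the pipeline actually used.

```python
def remove_control_taxa(sample_counts, control_counts):
    """Drop from a sample every taxon that has reads in a negative control."""
    contaminants = {taxon for taxon, n in control_counts.items() if n > 0}
    return {t: n for t, n in sample_counts.items() if t not in contaminants}


def index_hopping_detected(libraries, tracer_libraries, tracer_genus):
    """Return True if any non-tracer library holds reads of the tracer genus.

    `libraries` maps library name -> {taxon: read count};
    `tracer_libraries` is the set of library names built from the tracer
    sample (here, the spiral ginger libraries).
    """
    for name, counts in libraries.items():
        if name in tracer_libraries:
            continue
        for taxon, n in counts.items():
            # Assumed convention: genus is the first word of the taxon name.
            if n > 0 and taxon.split()[0] == tracer_genus:
                return True
    return False
```

For example, calling `remove_control_taxa` with a control dictionary containing the "unidentified environmental" fungus would strip that taxon from a sample library, and `index_hopping_detected(libraries, tracer_libraries, "Costus")` would return False for a run in which Costus reads appear only in the ginger libraries.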