HTS data and phylogenetic analysis
The Illumina MiSeq sequencing run generated a total of 4,150,073
reads across all samples. After quality filtering, dereplication and
chimera removal, we obtained 2,595,039 DNA sequences.
Clustering these sequences at a 97% identity level resulted in a
total of 15,234 OTUs across all samples. After removing singletons, OTUs
with nonsense codons and OTUs not belonging to the phylum Bacillariophyta, we
obtained 7834 OTUs. Finally, after removing OTUs with a normalized read
value below 0.005%, we obtained 4707 OTUs. The number of OTUs per
sample ranged between 390 and 980, with an average of 634 OTUs per
sample. Taxonomic assignment was successful for 3138 OTUs, which
were assigned to 219 species and 90 genera. The numbers of genera and
species per sample ranged from 31 to 56 and from 58 to 107, respectively,
with averages of 42 and 80. Twenty-two taxa were present in all
samples analyzed by molecular methods, of which three species (Ulnaria acus, Eunotia bilunaris and Achnanthidium minutissimum) and two
genera (Gomphonema sp. and Fragilaria sp.) were also
present in most of the morphologically analyzed samples. A total of 1569
OTUs could not be assigned using the R-Syst::diatom reference database and
remained unclassified.
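
The abundance filter described above (removal of OTUs with a normalized read value below 0.005%) can be illustrated with a minimal Python sketch. It assumes that the normalized read value is an OTU's read count divided by the total number of reads in the table; the text does not state whether the denominator is the whole dataset or a single sample. The function name, OTU labels and read counts are hypothetical and serve only as an illustration, not as the pipeline actually used.

```python
from typing import Dict


def filter_low_abundance(otu_reads: Dict[str, int],
                         min_percent: float = 0.005) -> Dict[str, int]:
    """Keep OTUs whose share of all reads is at least `min_percent` (in %).

    Assumption: the share is computed against the total reads in the table.
    """
    total = sum(otu_reads.values())
    if total == 0:
        return {}
    return {
        otu: reads
        for otu, reads in otu_reads.items()
        if 100.0 * reads / total >= min_percent
    }


# Hypothetical toy table: 100,000 reads in total.
toy_table = {"OTU_1": 90_000, "OTU_2": 9_996, "OTU_3": 4}

# OTU_3 holds 0.004% of the reads and falls below the 0.005% threshold.
kept = filter_low_abundance(toy_table)
print(sorted(kept))
```

With a per-sample normalization, the same function could be applied to each sample's read counts separately before merging the retained OTUs.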
In the phylogeny constructed with the 708 reference sequences, we
observed that several sequences were not placed correctly. This was more
noticeable in the phylogeny constructed with the same reference
sequences plus the 3138 taxonomy-assigned OTUs, where some reference
sequences were placed outside the group of OTUs assigned to the same taxon.
Access to the rbcL sequence alignments and phylogenetic trees is
detailed in the Data accessibility section.