4.1. Perspectives on improvement
Taking full advantage of the potential of CBH for biodiversity
inventories based on long, high-resolution fragments will require
several important improvements, in particular adapting the sequencing
depth and testing de novo DNA reconstruction.
In fact, the reconstruction was done here by mapping sequences to the
existing SILVA databases using the program EMIRGE (Miller et al., 2011).
This program, developed for 16S, proved extremely efficient,
with 2224 full barcodes reconstructed, leading to 6132 detections in ten
samples. Theoretically, a similar result could be expected for
eukaryotic 18S, yet the number of identifications with EMIRGE was less
than 6% of that obtained with short CBH using Kraken2. To exclude the
possibility that EMIRGE is not sufficient for 18S reconstruction, we
also tested the program MAFFT for reconstruction (Katoh, Misawa, Kuma,
& Miyata, 2002), which was also shown to be suitable, with similar
results (data not presented). Thus, enhancing the sequencing depth is
the main solution to improving the reconstruction of long fragments
based on CBH.
For eDNA analysis in general, high sequencing depth is crucial (Singer,
Fahner, Barnes, McCarthy, & Hajibabaei, 2019) and will allow de
novo assembly instead of mapping algorithms, improving species
identification (Deiner et al., 2017) and the detection of taxa or groups
absent from the reference dataset. In this first study, both gene
regions were pooled for sequencing, and the results suggest that i) the
16S probes were more efficient than the newly designed 18S probes in
uncovering community richness, ii) the dominance of prokaryotic biomass
was reflected in the reads, or iii) the method suffered from both
limitations. The biomass imbalance could be addressed by sequencing
libraries from each rDNA
separately. However, most bacterial diversity may be revealed with lower
sequencing depth for these unicellular organisms, while the greater
variation in body size and in the number of 18S rDNA copies among
metazoans may result in a more uneven distribution of the number of
fragments among taxa. In any case, a higher sequencing depth will be
needed to unravel the diversity of the communities they form through
full barcode reconstruction. Setting a generic number of sequences would
not be realistic because the optimal number strongly depends on the
diversity and biomass of the standing stock in the sediment. When
studying new areas, pilot metabarcode studies may help tune the
sequencing depth, allowing the optimal inventory of full-length
metabarcodes in the studied ecosystems.
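
One common way to use pilot data for this kind of tuning is a rarefaction
curve: the expected number of taxa recovered is computed for increasing
subsample sizes, and a curve that is still rising at the full pilot depth
suggests that deeper sequencing would reveal more taxa. The sketch below
is an illustration only and is not part of the analyses reported here; the
function names and the read counts in `pilot_counts` are hypothetical. It
uses the standard analytical (hypergeometric) rarefaction formula for
sampling reads without replacement.

```python
from math import comb


def expected_richness(counts, n):
    """Expected number of taxa in a random subsample of n reads.

    counts: reads per taxon in the pilot sample.
    Reads are assumed to be drawn without replacement.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    denom = comb(total, n)
    # A taxon with c reads is missed with probability C(total - c, n) / C(total, n).
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


def rarefaction_curve(counts, steps=10):
    """Return (depth, expected richness) pairs at evenly spaced depths."""
    total = sum(counts)
    depths = sorted({max(1, round(total * k / steps)) for k in range(1, steps + 1)})
    return [(d, expected_richness(counts, d)) for d in depths]


# Hypothetical pilot read counts per taxon (illustrative, not from this study).
pilot_counts = [500, 300, 120, 40, 20, 10, 5, 3, 1, 1]

for depth, richness in rarefaction_curve(pilot_counts, steps=5):
    print(f"{depth:>6d} reads -> {richness:.2f} expected taxa")
```

If the expected richness has not levelled off at the deepest subsample,
the pilot depth is probably insufficient for the community in question.
Note that read counts are only a rough proxy for abundance, for the
reasons given above (body size and rDNA copy number).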
We tested here for the first time a new set of DNA probes that proved
useful and efficient, yet the method still needs improvement to capture
several taxa, such as nematodes and some other meiofauna taxa with low
levels of detection. In fact, meiofauna of deep-sea sediments are
generally dominated by nematodes and copepods in terms of biomass,
abundance, and species richness (Zeppilli et al., 2018). Neither CBH nor
MTB consistently reflects this, suggesting that not only sequencing
depth but also the versatility of the set of probes needs to be
enhanced.
Finally, the huge gaps in knowledge of marine biodiversity, magnified in
the deep sea, result in a paucity of deep-sea sequences in nucleotide
reference databases (Sinniger et al., 2016), inhibiting the efficiency
of reference-based bioinformatic reconstructions (Mendoza,
Sicheritz-Pontén, & Gilbert, 2014). This likely explains the very
uneven reconstruction success across the phyla observed, with rather
good results for Porifera, Annelida, and some Arthropoda, while for
other groups such as Deuterostomia, Mollusca, Nematoda, Cnidaria, and
Platyhelminthes, no reconstruction could be obtained despite numerous
identified sequences.