1. Introduction
Over two-thirds of the Earth is covered by oceans, likely sheltering a
high level of still poorly studied biodiversity, particularly in the
deep sea (Costello & Chaudhary, 2017; Costello, Cheung, & De Hauwere,
2010). Today, molecular approaches provide nonintrusive methods to study
the diversity of marine environments, even those that are difficult to
access for sampling. The analysis of environmental DNA (eDNA;
Taberlet, Coissac, Hajibabaei, & Rieseberg, 2012) represents a
promising path to inventory biodiversity and lays the groundwork for the
development of molecular biomonitoring protocols (Andruszkiewicz et al.,
2017; Apothéloz-Perret-Gentil et al., 2017; Bohan et al., 2017; Cordier
et al., 2018; Derocles et al., 2018). Approaches based on eDNA target
the genetic material present in environmental samples that are
relatively easy to access and collect, such as air, soil, sediment, or
water (Bohmann et al., 2014; Thomsen & Willerslev, 2015), allowing us to
characterize the macro- and microorganisms present in the surrounding
habitats. First
developed for the uncultivable majority that represents the microbial
world (Xu, 2006), metabarcoding approaches, relying on PCR-based
amplicon sequence identification combined with high-throughput
sequencing, were transferred to eukaryotes early on (Creer et al., 2010;
Hajibabaei, Shokralla, Zhou, Singer, & Baird, 2011; Taberlet et al.,
2012; Valentini, Pompanon, & Taberlet, 2009).
Over the last decade, metabarcoding protocols have been improved, from
sampling up to bioinformatic steps, to optimize their resolution and
interpretation. Nevertheless, biomonitoring and biodiversity inventory
using metabarcoding are challenging (Miya et al., 2015; Yamamoto et al.,
2017) for two main reasons. First, this method relies on PCR-based DNA
enrichment, which suffers from unequal amplification across taxa (PCR
bias) and from artifacts (PCR errors resulting in sequence errors),
leading to biased biodiversity inventories (Acinas, Sarma-Rupavtarm,
Klepac-Ceraj, & Polz, 2005; Kanagawa, 2003; Sefc, Payne, & Sorenson,
2007; Smyth et al., 2010). Second, the evolution of commercially
available high-throughput sequencing technologies has led to higher
yield but shorter reads, restricting metabarcoding to short fragments.
Such short fragments (usually 150 to 450 base pairs) lead to
a less reliable assignment of sequences to taxa and hamper the use of
data produced for comprehensive phylogenetic reconstructions. This is
particularly limiting in ecosystems where biodiversity is poorly
described and reference databases contain large gaps: there, many
unassigned sequences may correspond to real but undescribed
biodiversity, yet teasing them apart from spurious sequences would
require phylogenetic reconstruction. The limiting factor for taxonomic
assignment of deep-sea organisms is the general lack of sequence
references for marine systems. Some major groups, such as nematodes,
which are the most abundant and diverse benthic metazoan taxa, can
rarely be identified genetically (Dell’Anno, Carugati, Corinaldesi,
Riccioni, & Danovaro, 2015; Gambi & Danovaro, 2016). Thus, long,
high-quality barcode libraries are needed to improve taxonomic
identification in general, especially for poorly known groups.
Theoretically, direct metagenomic sequencing (such as shotgun
sequencing) could solve these limitations, as these sequences can also
be reconstructed from eDNA to obtain a comprehensive overview of the
taxonomic diversity of the studied community, free of PCR bias (Porter
& Hajibabaei, 2018) and allowing reliable phylogenies based on long
fragments. However, the production of metagenomes is still extremely
costly, leading to a dominance of prokaryotic sequences, and the
de novo reconstruction of comprehensive metagenomes is highly
time-consuming; distinguishing biological variation from
sequencing errors is hardly possible and highly limited by gaps in
reference databases (Ghurye, Cepeda-Espinoza, & Pop, 2016; Quince,
Walker, Simpson, Loman, & Segata, 2017).
As an intermediate, less expensive option, to avoid the two main
limitations associated with metabarcoding, two other methods of DNA
enrichment are available (Mamanova et al., 2010; Mertes et al., 2011):
the molecular inversion probe (MIP) and capture by hybridization (CBH).
CBH exists in two variations: “on-array capture” on a solid microarray
or “in-solution capture”, which takes place within a fluid medium
(Gasc, Peyretaillade, & Peyret, 2016). Here, we use the latter, which
was first described by Gnirke et al. (2009) for human exome
resequencing, whereby hybrid probes are designed to enrich genomic DNA.
While the initial cost of this system is high, multiplexing libraries
(up to 96-well plates) has been shown to make the sequencing of several
samples highly efficient (Meyer & Kircher, 2010).
Moreover, a diversity of probes (single-stranded sequences of DNA)
designed at different locations of the target gene regions makes it
possible to capture a much wider diversity and recover long fragments,
thereby improving taxonomic assignment and allowing reasonable
phylogenetic
reconstruction (Denonfoux et al., 2013; Gasc & Peyret, 2018).
Furthermore, a low-concentration DNA template is sufficient, allowing
this method to be successfully used in low biomass environments (such as
air or deep-sea biomes) wherein generally lower DNA concentrations are
obtained, as in deep oligotrophic aquifers (Ranchou-Peyruse et al.,
2017). The first test using eDNA showed that CBH can detect
concentrations 100-fold lower than traditional methods
(Seeber et al., 2019), while others reported reduced tractability for
samples with less than 0.1 ng of total gDNA (Wilcox et al., 2018). Testing
within complex prokaryotic communities even allowed the detection of
extremely rare members (less than 0.0001%) (Gasc &
Peyret, 2018). It has been suggested that the final success of this
method depends strongly on the probes (Ribière et al., 2016) rather than
on the initial biomass and DNA concentration.
Improved biodiversity assessments can thus be expected using CBH, which
(i) avoids PCR steps while targeting a broader range of biodiversity in a
single reaction by using a comprehensive and versatile set of probes and
(ii) reconstructs long fragments for full barcode regions, allowing
reliable phylogenetic positioning and reconstruction. In recent years,
this methodology has been shown to markedly improve microbial diversity
inventories with precise taxonomic affiliation at the species level
(Gasc & Peyret, 2018). CBH was also applied to recover full-length
microbial eukaryotic cDNAs in complex environmental samples (Bragalini
et al., 2014) and to directly capture long DNA fragments (Gasc &
Peyret, 2017). Additionally, CBH using mitochondrial barcodes for
inventories of metazoans in bulk or ethanol-preserved samples resulted
in a highly accurate census of species (Gauthier et al., 2020; Shokralla
et al., 2016), and similar results were obtained when testing the
detection of a broad range of metazoans, including mammals, from aquatic
and sediment eDNA samples (Seeber et al., 2019; Wilcox et al., 2018).
Finally, CBH represents a promising path for phylogenetic studies,
as recently shown for butterflies (Kawahara et al., 2018).
With this study, we aimed to assess the potential of 16S and 18S rDNA
enrichment by CBH coupled with high-throughput sequencing to explore the
biodiversity of prokaryotes and eukaryotes, including metazoans, in the
deep sea (~500-2800 m depth). We analyzed eDNA samples
extracted from sediment to compare CBH with metabarcoding for the V4 16S
rDNA region and the V1-V2 18S rDNA region.