Materials and Methods
Stomach samples
collection
Stomach content samples (\(n=27\)) were collected from fiveCampylomormyrus species with markedly different EOD and snout
morphologies (C. alces , \(n=2\); C.compressirostris ,\(n=10\); C.curvirostris , \(n=2\);C. numenius , \(n=1\); C. tshokwe , \(n=9\); see Lamanna
et al. 2016 for phylogenetic relationships, EODs, and snout
characteristics). Additionally, stomach content samples ofGnathonemus petersii (G. petersii, \(n=3\)), the sister
genus of Campylomormyrus , were used for comparison. UnlikeCampylomormyrus , Gnathonemus petersii has no snout, but a
trunk-like protrusion on the head. The fish specimens from which these
stomach content samples are extracted were collected during an
expedition to the Republic of the Congo in fall 2012. All the analyzed
stomach samples were extracted from sympatric fishes live in the Congo
River rapids south of Brazzavilla [S4°18.788′ E15°13.790′]. Before
stomach extracting, all fish individuals were weighed and measured for
total and standard length and were assigned unique and museum
identifiers (see Table S1.1 in Supplementary File S1). After dissection,
the stomach samples were stored in Queen’s lysis buffer solution (Seutin
et al.1991). Before DNA extraction, the stomach contents were visually
checked and found to be heavily digested, with very few visible tissue
parts remaining. Handling of animals was conducted at the University of
Brazzaville in accordance with the relevant national guidelines and
regulations guidelines for the care and use of animals for scientific
purposes.
Lysis and DNA extraction
Each stomach content sample was homogenized using a Tissue Tip
Homogenizer before incubating with lysis reagents. The Qiagen DNeasy
Mericon Food Kit for extraction of high-quality DNA from digested foods
was used to carry out the DNA extraction from the homogenized samples.
Since the stomach content samples were almost digested, the DNA was
suspected to be highly fragmented. Therefore, the small fragment
protocol of this kit (200 mg) was used according to the manufacturer’s
protocol. Initial attempts to directly amplify DNA with established DNA
barcoding primers were unsuccessful (Lamanna & Tiedemann, unpubl.
results).
Libraries preparation, indexing and
amplification
After DNA extraction, all samples underwent single-stranded DNA library
preparation using T4 DNA ligase following (Gansauge et al. 2017). The
procedure is optimized to highly degraded DNA fragments, as expected in
digested food from the stomach. An additional step of deoxyuracils
removal using Uracil DNA glycolase was included. Deamination of cytosin
to uracil is a typical damage pattern in degraded DNA and could cause
erroneous mispairing to adenin, resulting in genotyping errors.
After library preparation, the optimal number of Polymerase Chain
Reaction (PCR) cycles required for library indexing and amplification
was estimated by Quantitative Polymerase Chain Reaction (qPCR). The
rationale is to adjust the number of PCR cycles during the library
amplification to different DNA concentrations (Basler et al. 2017). The
reaction was performed by mixing 1ml of a 1:20 dilution of each library
with 3µl of nuclease free water, 5µl of 2 x SYBRTM green qPCR master mix
(ThermoFisher Scientic), 0.5µl of qPCR primer IS7, and 0.5µl of primer
IS8 (both 10µM; from Gansauge & Meyer 2013). For each library three
reaction replicates were analyzed. The PCR reactions were performed with
7min at 95°C, followed by 40 cycles of 10s at 95°C denaturation, 30s
annealing at 60°C, and 1min extension at 72°C. Considering the
differences in DNA concentration between the qPCR and the indexing PCR,
the optimal number of cycles for the indexing PCR was calculated
according to Basler et al. (2017).
The indexing primers P5 and P7 were chosen according to the library
protocol of Gansauge et al. (2017). The reaction mix for each sample
contained 44.8µl of nuclease free water, 8µl of 10 x AccuPrimeTM Pfx
reaction mix (ThermoFisher Scientic), 0.8µl of 2.5U AccuPrimeTM Pfx
polymerase, 3.2µl of P5 indexing primer, 3.2µl of P7 indexing primer
(both 10 µM) and 20µl of each library. Thermocycling started with 2min
at 95°C, followed by the number of cycles calculated according to Basler
et al. (2017) consisting of 15s at 95°C denaturation, 30s annealing at
60°C and 1 min extension at 68°C. After indexing and amplification, all
libraries were purified using the MinElute PCR Purification Kit (QIAGEN)
following the manufacturer’s instructions. The libraries were quantified
with a Qubit 3.0 fluorometer after purification and the distribution of
the DNA fragments lengths was measured using an Agilent 2100 bioanalyzer
system.
Probe design and DNA hybridization
capture
For probe design, we selected target species based on literature
research on the potential taxa eaten by Campylomormyrus and the
expected diversity of prey items in the Congo River region. We targeted
504 species across 6 animal taxa: Arthropoda, Annelida, Mollusca,
Nemertea, Nematoda, and Rotatoria. Additionally, we targeted 29 species
from 3 plant taxa: Streptophyta, Rhodophyta, and Chlorophyta. For these
species, mitochondrial COI gene (animals) resp. chloroplast rbcL gene
(plants) sequences were retrieved from the GenBank sequence database of
the National Center for Biotechnology Information (NCBI). When a species
of interest did not have sequence information available, sequence
information from another species of the same genus was included. If this
was not available, the taxon was excluded. The average sequence length
of all collected sequences was 700 bp. A complete
list of taxonomic names and NCBI
accession numbers can be found in the Supplementary File S2. The
sequence capture probes were custom-made at Arbor Biosciences (formerly
MY croarray) and supplied in form of a MYBaits InSolution Custom Target
Capture Kit (designs with 1-20k probes). The probes were designed to
achieve a high overall probe count and an increased coverage in high
GCcontent regions. The final probe set contained 15284 probes, each with
a length of 80bp. The synthesized probes comprise short overlapping
fragments representing the whole probe template sequences of COI resp.
rbcL.
The hybridization capture enrichment reactions were performed according
to the manufacturer’s instructions (Biosciences 2016). Following these
instructions, the recommended input of the library DNA was 100ng to
500ng. This was equivalent to 7µl of the library material with a DNA
concentration of 14 to 72ng/µl. All materials from the library were used
for two rounds of hybridization capture reactions, thereby increasing
specificity (Krueger et al. 2021). The captured and amplified libraries
of the first round of capture reaction were used as input libraries for
the second round. For both rounds of capture, the libraries were not
pooled since pooling might produce low quality results (Biosciences
2016). The second captured and amplified libraries were pooled and
underwent sequencing on an Illumina MiSeq instrument using a protocol
for 2x150bp paired end of double indexed libraries (Biosciences 2016).
Bioinformatics, data analysis, and
statistics
Before further analyzing the high throughput sequencing reads, which
were generated in FASTQ format, we performed a quality processing to
remove the adapter sequences and the low-quality reads, using Cutadapt
version 2.10 (Martin 2011). Quality cutoff, minimal sequence length, and
minimal overlap between sequence and adapter were set to 20, 30, and 3,
respectively. Then, we performed quality control checks to the filtered
sequences using FastQC (Andrews 2010) to ensure their suitability for
further analysis.
Taxonomic classification for the filtered sequences was done using
Kraken version 2.0.9 (Kraken2; Wood et al. 2019). Kraken compares the
reads of the metagenome to short sequences of length \(k\), the
so-called \(k\)mers, from a database that is associated with the
sequence information underlying the tree of life phylogeny. The
algorithm then places the read on the tree of life based on its
similarities to these \(k\)mers. Before applying Kraken2, a custom
marker database of available eukaryotic COI and RBCl sequences from NCBI
was built to map the reads. The results of Kraken2 were visualized and
inspected using the online tool Pavian metagenomics data explorer
(Breitwieser & Salzberg, 2019).
The read counts were used to estimate the relative abundance of food
taxa in the diet of each sample. For any sample, these count data were
used to record both the occurrence (presence/absence) of a taxon and the
percentage of reads assigned to that taxon. The frequency of occurrence
(%FOO) is used to quantify occurrence across samples. The relative read
abundance (RRA) is used as a proxy for relative biomass consumed.
Details on the calculations of these terms are given in the
Supplementary File S7.
The dietary niche width was assessed by calculating the Shannon
diversity index (alpha diversity) for each sample using the spaaR package (Zhang 2016). Further, we used two complementary metrics to
describe the diet overlap among species: Schöner index (Schoener 1970)
and Pianka index (Pianka 1974). Pairwise calculations of these indices
were performed using the spaa R package (Zhang 2016). The
equations of the indices are given in the Supplementary File S7. We used
the EcoSimR R package (Gotelli et al. 2015) to compare the
results with reference to 1000 permutations of a null model that holds
the same dietary niche width, while randomizing the values for the diet
items. We further calculated Bray–Curtis dissimilarity indices (beta
diversity) between each pair of samples using thevegdist -function in the vegan R package (Oksanen et al.
2019). Additionally, we performed nonmetric multidimensional scaling
(NMDS) using the function metaMDS in the vegan R package
to visualize the patterns of dietary dissimilarity among samples. We
tested for differences in diet composition among species by performing a
permutational multivariate analysis of variance (perMANOVA) with 999
permutations, using the adonis -function in the vegan R
package. To further test for the influence of our two divergent
phenotypic characters EOD and snout morphology on the food selection, we
grouped the stomach samples as follows: Based on the EOD, we formed two
groups: Long EOD (>2ms) and short EOD (<0.3ms;
note the almost 10fold difference in EOD length among the two groups).
The first group (\(n=12\)) included the samples of C. alces ,C. numenius , and C. tshokwe . The second group (\(n=15\))
included the samples of C. compressirostris , C.
curvirostris , and G. petersii . EOD categorization was based on
Lamanna et al. (2016). Further, the samples were grouped based on the
snout length into three groups: Long snout (C. curvirostris ,C. tshokwe , and C. numenius , \(n=12\), medium snout
(C. compressirostris , \(n=10\)), and short snout (C.
alces and G. petersii , \(n=5\)). For this classification, we
took advantage of a previous geometric morphometrics analysis, which
stratified the respective species according to snout length along
principal component 1 (Figure 3 in Lamanna et al. 2016). Differences in
diet among groups were assessed with a perMANOVA as above.