Study group, sampling and
identification
The study group is the megadiverse tribe Sericini that contains
worldwide nearly 4000 described species in about 200 genera . They are
one of the oldest lineages of phytophagous Scarabaeidae and diversified
with the rise of the angiosperms 108 Ma . Sericini are nearly worldwide
distributed, except in Australia, most oceanic islands, archipelagos,
and circumpolar regions . The polyphagous herbivore adults are fully
winged while larvae feed on roots and underground stems of living plants
. Some species are considered as crop pests . Their highly similar
external morphology makes the species difficult to distinguish, but
highly complex male genitalia are well-differentiated between species
and show only little intraspecific variation .
Sampling was conducted during four weeks in April, 2014 by Carolus
Holzschuh and local collectors in the Phou Pan mountain area (Laos, Hua
Phan province) (Fig. 1) (ca. 20°12’N, 104°01’E), at an elevation between
1300 to 2000 meters. Specimens were collected using light traps, by
hand, or netting during daytime. The Phou Pan mountain is situated in
the Indo-Burmese biodiversity hotspot area (Myers et al., 2000) which is
characterized by extremely high endemism. The habitat with its dense
rainforests offers a large variety of plant species for herbivore
insects to feed on. For this study we used only males (1086 specimens),
since they were assignable to distinct morphospecies, while females are
often not distinguishable among closely related syntopically occurring
species. Samples were pinned after DNA extraction, dry mounted,
labelled, and preserved at the ZFMK (Zoologisches Forschungsmuseum
Alexander König, Bonn, Germany) (see Supplement Table 1).
Specimens were sorted to morphospecies by the complex shape of their
copulation organ, i.e., aedeagus, which has been proven to be the best
suited trait system to robustly infer the species entities for this
group . For this purpose, male genitalia of all specimens were
dissected. Habitus and genitalia of each species were photographed with
a stereomicroscope (ZEISS Stereo Discovery.V20) connected to a ZEISS
Axiocam. Presumably undescribed species that were not yet referable to
an available species name, were numbered consecutively (sp1, sp2, etc.).
DNA sequencing
We sequenced the cox1 gene (5′-end) of multiple specimens (3-5)
per morphospecies (in total 190). Lab work followed the standard
protocols of the German Barcode of Life project . DNA was extracted from
mesothoracic leg and attached muscles using the Qiagen DNeasy Blood and
Tissue Kit (Hilden, Germany) or the Qiagen BioSprint96 magnetic bead
extractor (Hilden, Germany).
The PCR reaction was carried out in total reaction mixes of 20 μl,
including 2 μl of undiluted DNA template, 0.8 μl of each primer (10
pmol/μl), and standard amounts of the reagents provided with the
“Multiplex PCR” kit from Qiagen (Hilden, Germany) using primers
LCO1490-JJ [5’-CHACWAAYCATAAAGATATYGG-3’] and HCO2198-JJ
[5’-AWACTTCVGGRTGVCC AAARAATCA-3’] . Thermal cycling was performed
on Applied Biosystems 2720 thermal cyclers (Life Technologies, Carlsbad,
CA, USA), using a PCR program with two cycle sets, combining a
“touchdown” and a “step-up” routine as follows: hot start Taq
activation: 15 min at 95 °C; first cycle set (15 repeats): 35 s
denaturation at 94°C, 90 s annealing at 55°C (−1°C per cycle) and 90 s
elongation at 72°C; second cycle set (25 repeats): 35 s denaturation at
94°C, 90 s annealing at 40°C, and 90 s elongation at 72°C; final
elongation 10 min at 72 °C. Unpurified PCR products were subsequently
sent for bidirectional Sanger sequencing to BGI Tech Solutions
(Hongkong, China).
Raw DNA sequences were assembled (forward and reverse sequence) and
edited in Geneious R7 (version 7.1.3, Biomatters Ltd.) to correct
base-calling errors and to assign ambiguities (when forward and reverse
sequence were not congruent for certain nucleotides). Sequences were
aligned with Muscle (Edgar, 2004) as implemented into Geneious using the
default settings. Primers were trimmed subsequently. All data are
deposited in BOLD (project: SCOIB) and GenBank (accession numbers
MW128167-MW128351) respectively (see Supplement Table 1).
Phylogenetic analysis and species
delimitation
Putative morphospecies were compared with results obtained from the
DNA-based species delimitation methods. We applied Poisson tree process
(PTP) , statistical parsimony network analysis (TCS) , Automatic Barcode
Gap Discovery (ABGD) , distance based clustering, and Barcode of Life
database (BOLD) - Barcode Index Numbers (BINs). These methods were
applied on all sequenced beetles to result in clusters that are
considered molecular taxonomic units (MOTUs) , i.e., DNA-based
species-assignments by the respective method.
A phylogenetic tree was calculated with maximum likelihood from the
final multiple alignment of all DNA sequences using the IQ-TREE web
server (IQ-TREE version 1.6.12; http://iqtree.cibiv.univie.ac.at/)
; the best substitution model (GTR+F+I+G4) was chosen with ModelFinder
(Kalyaanamoorthy et al., 2017) according to Bayesian Information
criterion (BIC). Branch support was calculated by generating 1000
samples for ultrafast bootstrapping (Hoang et al., 2018). The resulting
tree was midpoint rooted in FigTree v1.4.3 (available from
http://tree.bio.ed.ac.uk/software/figtree/). This tree was the
basis for the PTP analysis. Additionally, split networks were generated
using SplitsTree4 v. 4.16.1 to visualize incompatible and ambiguous
signals in the cox1 dataset. In these networks the parallel
edges, rather than the single branches, illustrate splits concluded from
the data.
We used both versions of the Poisson tree process model (PTP) on the PTP
web server (https://species.h-its.org/; accessed on August 5th
2020): bPTP, which adds Bayesian support (pp) values to branches that
delimit species in the input tree, and the refined multi-rate mPTP. PTP
uses the shift in the number of substitutions at internal nodes to
identify branching rate transition points which indicate speciation
events. We used default settings for the bPTP analysis (100000 MCMC
generations, thinning: 100, burn-in: 0.1, seed 123).
Statistical network analysis as performed with TCS v. 1.21 separates the
sequence data into clusters of closely related haplotypes connected by
changes that are non-homoplastic with a 95% probability (Templeton et
al., 1992); if applied to mtDNA the extent of the networks has been
found to be largely congruent with morphospecies
Automatic Barcode Gap Discovery (ABGD) was conducted using the ABGD
webserver (https://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html;
accessed on August 17th 2020) with default parameters (i.e., using
Jukes-Cantor model (JC69) distances, a relative gap width of 1 and 50
steps, Pmin=0.001, Pmax=0.1, Nb bins for distance distribution= 20).
ABGD partitions individuals for a range of prior intraspecific
distances, instead of using one predefined distance threshold . A robust
result across a range of prior intraspecific distances was chosen as the
best partition scheme. This outcome was also closest to the number of
morphospecies and simultaneously matched the presumptive barcode gap in
the histogram of distances.
Distance based clustering was done with the tclust-function in the
R-package spider (v. 1.5.0; Brown et al., 2012). A threshold of 3% was
applied to the pairwise distance matrix of all specimens that was
corrected with the Kimura model (K80). The logic of this approach
underlies most metabarcoding protocols (Macher et al., 2018; Piper et
al., 2019; Liu et al., 2020), relying on the presence of a barcoding gap
(Elbrecht et al., 2017), which was chosen as a gap at 3 % pairwise
distance by the majority of studies (however, see Beentjes et al., 2019
for an 2 % example). Finally, we compared outcome from species
delimitations to Barcode Index Number (BIN) assignments (Ratnasingham &
Hebert, 2013) in the BOLD data
base (Project - SCOIB: Sericini COI Barcoding).
To check the performance and accuracy of the DNA-based delimitation
methods compared to the a priori morphospecies hypotheses based on the
genital morphology, the match ratio was calculated: Match ratio =
2*Nmatch/(NMOTU+Nmorph).Nmatch is the number of species with exact matches, when
the morphospecies and DNA-based species delimitation results to include
the same specimens. NMOTU is the number of classified
groups by the different delimitation methods and finally
Nmorph is the number of morphospecies. All resulting
MOTUs were mapped onto the phylogenetic tree beside terminal’s labels
(Fig. 2).