Endosymbiont screening and phylogenetic analysis:
All 390 morphospecies were screened for three endosymbionts: Wolbachia, Cardinium and Arsenophonus. The incidence of each endosymbiont was estimated using primers specific to it (Table S2). The Multilocus Sequence Typing (MLST) system (Baldo et al., 2006) was used to identify and characterize the Wolbachia infections. For Cardinium and Arsenophonus, the 16S rRNA gene was amplified using specific primers (Table S2).
To test for the presence of Wolbachia, the wspec primers were used. Samples positive for wspec were then sequenced for one of the MLST genes, usually fbpA, and the chromatograms were inspected for multiple peaks to identify single Wolbachia infections. Samples with multiple Wolbachia infections were not processed further, as assigning a given sequence to a particular Wolbachia strain would have been impossible. The resulting allele sequences of the MLST genes were compared with existing sequences in the PubMLST database (Jolley, Bray, & Maiden, 2018) to identify their allele profiles (the number assigned to each unique sequence) and ST (sequence type, defined by the combination of the five MLST allele profiles). Sequences without a match in the PubMLST database were submitted to the database for curation. Sequences obtained in this study were deposited in the NCBI and BOLD databases (Table S3).
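The allele-profile-to-ST step described above can be illustrated with a minimal Python sketch. It assumes that allele numbers have already been assigned for the five MLST loci and that known profiles are available as a lookup table. The function names and the example table are hypothetical: the allele and ST numbers are placeholders, not PubMLST records.

```python
from typing import Dict, Optional, Tuple

# The five Wolbachia MLST loci named in the text, in a fixed order.
MLST_LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")

Profile = Tuple[int, ...]


def make_profile(alleles: Dict[str, int]) -> Profile:
    """Return the allele numbers as a tuple in fixed locus order."""
    missing = [locus for locus in MLST_LOCI if locus not in alleles]
    if missing:
        raise ValueError("missing allele numbers for: " + ", ".join(missing))
    return tuple(alleles[locus] for locus in MLST_LOCI)


def assign_st(alleles: Dict[str, int],
              known_profiles: Dict[Profile, int]) -> Optional[int]:
    """Look up the ST for a five-locus profile; None means no known match."""
    return known_profiles.get(make_profile(alleles))


# Hypothetical lookup table (placeholder numbers, not real PubMLST entries).
EXAMPLE_PROFILES: Dict[Profile, int] = {
    (1, 1, 1, 1, 1): 101,
    (1, 2, 1, 1, 3): 102,
}

sample = {"gatB": 1, "coxA": 2, "hcpA": 1, "ftsZ": 1, "fbpA": 3}
print(assign_st(sample, EXAMPLE_PROFILES))
```

A return value of None corresponds to the case in which a profile has no match and would need to be submitted for curation.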
Sequences were aligned with Sequencher 5.2.4 (Gene Codes Corporation) and manually edited with BioEdit v. 7.2.5 (Hall, 1999). The best-fitting models of DNA sequence evolution were selected using MEGA7 (Kumar, Stecher, & Tamura, 2016). GTR+G (general time reversible model with γ-distributed rate variation) was found to be the best model for all CO1 phylogenetic trees. A Bayesian phylogeny was constructed for the CO1 sequences using MrBayes v3.2.5 (Ronquist et al., 2012). Each phylogenetic analysis was run at least twice and was accepted only if there was no change in the major branching order (Figure S2). Phylogenetic trees were visualized and edited with FigTree v1.4.2 (Rambaut, 2009).
Maximum likelihood phylogenetic trees of Wolbachia, Cardinium and Arsenophonus were constructed in MEGA7 with 1000 bootstrap replicates each. The best-fitting substitution models were T92+G+I (Tamura 3-parameter with γ-distributed rate variation and a proportion of invariable sites) for the concatenated MLST dataset, T92+G for the gatB, hcpA, ftsZ and fbpA genes, HKY+G (Hasegawa, Kishino, and Yano) for coxA, and K2+G (Kimura 2-parameter) for Cardinium and Arsenophonus.
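The text does not state how the concatenated MLST dataset was assembled. The following Python sketch shows one common way to do it, under the assumption that each locus has already been aligned separately and that only samples sequenced for all five loci are retained. The function name, sample identifiers and toy sequences are hypothetical.

```python
from typing import Dict, Sequence

# The five MLST loci named in the text; the concatenation order is an assumption.
LOCI_ORDER = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def concatenate_loci(alignments: Dict[str, Dict[str, str]],
                     loci_order: Sequence[str] = LOCI_ORDER) -> Dict[str, str]:
    """Join per-locus aligned sequences into one sequence per sample.

    alignments[locus][sample] holds the aligned sequence of that sample.
    Samples missing from any locus are dropped.
    """
    for locus in loci_order:
        lengths = {len(seq) for seq in alignments[locus].values()}
        if len(lengths) > 1:
            raise ValueError("sequences for " + locus + " differ in length")

    # Keep only samples that are present in every locus alignment.
    shared = set(alignments[loci_order[0]])
    for locus in loci_order[1:]:
        shared &= set(alignments[locus])

    return {
        sample: "".join(alignments[locus][sample] for locus in loci_order)
        for sample in sorted(shared)
    }


# Toy input with two loci and placeholder sequences.
toy = {
    "gatB": {"s1": "ACGT", "s2": "ACGA"},
    "coxA": {"s1": "TTG", "s2": "TTA", "s3": "TTC"},
}
print(concatenate_loci(toy, ("gatB", "coxA")))
```

In this toy input, sample s3 is dropped because it has no gatB sequence.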
To account for the frequent recombination seen in Wolbachia genomes, ClonalFrame v2.1 (Didelot & Falush, 2007) was used to infer phylogeny from the multilocus sequence data. ClonalFrame was run for 3 × 10^5 iterations, with the first 50% of iterations discarded as burn-in. Estimates of the recombination rate were also obtained.
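The burn-in arithmetic for this run can be made explicit with a short Python sketch. It is not ClonalFrame code: it only illustrates what discarding the first 50% of a chain means, and the helper names and the toy chain are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

TOTAL_ITERATIONS = 3 * 10**5   # run length stated in the text
BURN_IN_FRACTION = 0.5         # first 50% discarded


def burn_in_count(total_iterations: int, fraction: float) -> int:
    """Number of initial iterations discarded as burn-in."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return int(total_iterations * fraction)


def discard_burn_in(samples: Sequence[T], fraction: float) -> List[T]:
    """Drop the leading fraction of a chain and keep the remainder."""
    cut = burn_in_count(len(samples), fraction)
    return list(samples[cut:])


discarded = burn_in_count(TOTAL_ITERATIONS, BURN_IN_FRACTION)
print(discarded, TOTAL_ITERATIONS - discarded)

toy_chain = list(range(10))  # placeholder values standing in for sampled states
print(discard_burn_in(toy_chain, BURN_IN_FRACTION))
```

With the stated settings, 150,000 iterations are discarded and 150,000 are retained for inference.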