Recombination events between endosymbiont strains:
Recombination in endosymbiont genomes is pervasive, and such events contribute significantly to the diversification of these bacteria (Jiggins, von Der Schulenburg, Hurst, & Majerus, 2001). To check for recombination, we first estimated the overall rate of recombination in the Wolbachia sequences with both ClonalFrame and RDP4. Both analyses gave a ratio of nucleotide substitutions introduced by recombination to those introduced by point mutation (r/m) of around 2.4 (95% confidence interval 1.4-3.7), which represents an intermediate rate of recombination (Vos & Didelot, 2009). This indicates that recombination introduces more than twice as many nucleotide substitutions as point mutation in the Wolbachia dataset. Consistent with this, the Φ test in SplitsTree also showed significant evidence of recombination (p < 0.001) for the same Wolbachia sequences (Figure S3). For Cardinium and Arsenophonus, however, RDP4 did not indicate any evidence of recombination. This was probably because only a single gene (the 16S rRNA gene) was used for these two bacteria.
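As a rough aid to reading the r/m value, the ratio can be converted into the share of all substitutions attributable to recombination, (r/m) / (1 + r/m). The sketch below only performs this arithmetic on the point estimate and interval bounds reported above; the function name is hypothetical and is not part of ClonalFrame or RDP4.

```python
def recombination_share(r_over_m: float) -> float:
    """Fraction of all substitutions introduced by recombination,
    given r/m (recombination-derived / mutation-derived substitutions)."""
    if r_over_m < 0:
        raise ValueError("r/m must be non-negative")
    return r_over_m / (1.0 + r_over_m)


# Point estimate and 95% interval bounds taken from the text above.
for label, value in [("lower bound", 1.4), ("estimate", 2.4), ("upper bound", 3.7)]:
    share = recombination_share(value)
    print(f"{label}: r/m = {value}, recombination share = {share:.2f}")
```

At r/m = 2.4, roughly 71% of substitutions would trace to recombination under this reading.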
To enumerate the recombination events within the Wolbachia sequences, we first checked whether the single-gene phylogenies of the five MLST genes (Figure S4) differ significantly from the concatenated MLST tree (Figure 4). We then used the sliding-window algorithms in RDP4 to locate recombination breakpoints wherever possible. All of these recombination events were then evaluated and confirmed manually. These analyses yielded several possible recombination events, which are described below.
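The general idea behind a sliding-window breakpoint scan can be sketched as follows. This is a deliberately simplified illustration and not the method implemented in RDP4: it assumes three pre-aligned sequences of equal length (a query and two putative parents), and the function names, window size, and step are hypothetical choices. The sequences in the usage lines are synthetic.

```python
def window_identity(a: str, b: str) -> float:
    """Proportion of identical sites between two equal-length aligned strings."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def closest_parent_by_window(query, parent_a, parent_b, window=50, step=10):
    """Return a list of (window start, label), where label is 'A', 'B' or 'tie'."""
    if not (len(query) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        id_a = window_identity(query[start:end], parent_a[start:end])
        id_b = window_identity(query[start:end], parent_b[start:end])
        label = "A" if id_a > id_b else "B" if id_b > id_a else "tie"
        calls.append((start, label))
    return calls


def candidate_breakpoints(calls):
    """Window starts at which the closest parent switches (ties are skipped)."""
    breakpoints = []
    previous = None
    for start, label in calls:
        if label == "tie":
            continue
        if previous is not None and label != previous:
            breakpoints.append(start)
        previous = label
    return breakpoints


# Synthetic example: the query matches parent A up to site 60, then parent B.
parent_a = "A" * 100
parent_b = "C" * 100
query = "A" * 60 + "C" * 40
calls = closest_parent_by_window(query, parent_a, parent_b, window=20, step=10)
print(candidate_breakpoints(calls))
```

A scan of this kind carries no statistical test, so any switch it reports would still need the kind of manual evaluation described above.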
Recombination between supergroups: Several cases of acquisition of a gene or gene segment from a different supergroup were detected. Phylogenetic and network analyses of the concatenated MLST dataset (Figure 4) placed Wolbachia ST-N2, infecting morph0343 (Hymenoptera-Encyrtidae), within supergroup B. However, individual gene trees revealed that the coxA fragment of ST-N2, which is allele 7, clusters with supergroup A (Figure 4). This phylogenetic discordance suggests that the coxA gene of ST-N2 was acquired via recombination from a supergroup A Wolbachia. Notably, coxA allele 7 is also found in two other Wolbachia strains, ST-565 of morph0294 (Hymenoptera-Platygastridae) and ST-544 of morph0076 (Araneae-Orthobula), both of which are supergroup A infections (Table 1). Although it is impossible to know which Wolbachia strains originally underwent recombination and gave rise to the recombinant coxA allele 7, the presence of the same allele within the community suggests that the recombination event could have involved members of this ecological community.
Similarly, in another case, the supergroup B Wolbachia ST-560 of morph0214 (Hemiptera-Muellerianella) carried a coxA gene fragment (allele 2) from supergroup A (Figure 4). This recombinant coxA allele 2 is also similar in sequence to the coxA allele of ST-550 and ST-571 (allele 305), from which it differs by only two base pairs, suggesting that this is another case of recombination within the community.
Another case of recombination between supergroups involved a different MLST gene, gatB, and supergroups A and F. Wolbachia ST-552 (supergroup F), infecting morph0148 (Araneae-Zelotes), had a recombinant gatB in which the last 190 bp came from supergroup A. As the concatenated MLST tree (Figure 4) shows, ST-552 clusters with supergroup F, but the individual gatB gene tree places it in supergroup A. This 190 bp fragment differs by only one base pair from that of ST-544, infecting morph0076 (Araneae-Orthobula), which points to possible recombination between these two Wolbachia STs from different supergroups.
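The logic used in these cases, comparing the supergroup of each locus with the overall placement of the strain, can be illustrated with a small sketch. It assumes that a supergroup label has already been read off each single-gene tree, and it uses a simple majority vote as a stand-in for the concatenated-tree placement; the function name and the example assignment are hypothetical and are not taken from Table 1.

```python
from collections import Counter


def discordant_loci(locus_supergroups):
    """Return (majority supergroup, sorted loci assigned to another supergroup).

    locus_supergroups maps locus name -> supergroup label for one strain.
    """
    if not locus_supergroups:
        raise ValueError("no loci supplied")
    ranked = Counter(locus_supergroups.values()).most_common()
    top, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        raise ValueError("no unique majority supergroup")
    odd = sorted(locus for locus, sg in locus_supergroups.items() if sg != top)
    return top, odd


# Hypothetical strain: four loci group with B, while coxA groups with A.
example = {"gatB": "B", "coxA": "A", "hcpA": "B", "ftsZ": "B", "fbpA": "B"}
print(discordant_loci(example))
```

A locus flagged this way is only a candidate; a recombinant fragment shorter than the locus, as in the gatB case, would also need a breakpoint analysis.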
Recombination within supergroups: Pervasive recombination necessitated the development of the MLST scheme for Wolbachia (Baldo et al., 2006), as single-gene phylogenies were unable to properly represent the evolutionary history of a particular strain. In this scheme, sequences at any of the five loci are assigned the same allele number if they are identical. As Table 1 shows, many of the morphospecies share the same alleles. In fact, instead of the maximum possible number of unique alleles (180) that could have been present across the 5 MLST loci of the 36 infected morphospecies, there are only 136. This indicates acquisition of the same alleles by recombination; these are therefore examples of within-supergroup recombination events in which MLST fragments are exchanged between endosymbionts.
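The allele count above is simple bookkeeping: the maximum is the number of infected hosts multiplied by the number of loci (36 × 5 = 180), and the observed figure is the number of distinct alleles summed over loci. A minimal sketch of that tally is given below, assuming allele profiles are available as a mapping from host to locus to allele number. The locus names follow the standard Wolbachia MLST scheme; the function name and the three toy profiles are hypothetical and are not taken from Table 1.

```python
from collections import defaultdict

# The five loci of the Wolbachia MLST scheme.
LOCI = ("gatB", "coxA", "hcpA", "ftsZ", "fbpA")


def allele_sharing_summary(profiles):
    """Return (maximum possible alleles, observed unique alleles, difference).

    profiles maps host label -> {locus: allele number}.
    """
    seen = defaultdict(set)
    for alleles in profiles.values():
        for locus in LOCI:
            seen[locus].add(alleles[locus])
    max_possible = len(profiles) * len(LOCI)
    observed = sum(len(seen[locus]) for locus in LOCI)
    return max_possible, observed, max_possible - observed


# Toy profiles (illustrative only): coxA and ftsZ alleles are each shared once.
toy_profiles = {
    "host_1": {"gatB": 1, "coxA": 7, "hcpA": 1, "ftsZ": 1, "fbpA": 1},
    "host_2": {"gatB": 2, "coxA": 7, "hcpA": 2, "ftsZ": 2, "fbpA": 2},
    "host_3": {"gatB": 3, "coxA": 3, "hcpA": 3, "ftsZ": 2, "fbpA": 3},
}
print(allele_sharing_summary(toy_profiles))
```

Applied to the figures reported above, the same subtraction gives 180 - 136 = 44 fewer alleles than the maximum.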
Next, we tried to identify intragenic (i.e., within a particular MLST gene) recombination within a supergroup. Since this detection depends on the algorithms implemented in RDP4, these estimates are inherently conservative. Most of these algorithms scan for higher-than-expected sequence divergence in the given dataset. Therefore, recombination events between closely related strains and/or between regions with low variation will not be recorded as significant by these algorithms.
There can be two types of such recombination events. First, fragments of different MLST genes (e.g., coxA and gatB of two different strains) can combine to form a chimeric gene; second, recombination can occur within the same MLST gene (e.g., within coxA of two different strains). Our analysis did not find any examples of the former. This is unsurprising, as all the MLST fragments are from housekeeping genes and such chimeric variants would be under strong negative selection. However, eight instances of recombination within the same MLST gene were found (Table 2), all within supergroup A.