Torix Rickettsia is the most common bacterial contaminant produced during barcoding projects
Of the 3,817 sequences considered contaminants, 1,126 were deemed by BOLD to be bacterial in origin (Figure 1, Table S4). Phylogenetic placement supported the designation of these sequences as microbial in origin (Figure 2). The dominant genus was Rickettsia with 753 (66.9%) amplifications, compared to Wolbachia with 306 (27.2%). Of the remaining 67 non-target sequences, 16 formed a monophyletic group with other Anaplasmataceae and 51 were undesignated proteobacteria. When considering the 184,585 specimens in the total project, this analysis gave an overall Rickettsia and Wolbachia prevalence of 0.41% and 0.17%, respectively, within the dataset. Through later access to the 55,366 representative data subset from which the contaminants originated, further unique bacterial contaminants were also detected (possibly missed by BOLD’s automated contaminant filtering system). This suggests that these prevalences are conservative estimates.
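The percentages in this paragraph follow directly from the reported counts. As a minimal sketch of that arithmetic (the variable and function names are illustrative assumptions, not part of the original analysis), the figures can be reproduced as follows:

```python
# Counts as reported in the text; names below are illustrative only.
total_specimens = 184_585          # specimens in the total project
bacterial_contaminants = 1_126     # contaminants deemed bacterial by BOLD
genus_counts = {"Rickettsia": 753, "Wolbachia": 306}


def percent(part: int, whole: int, digits: int) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Share of each genus among the bacterial contaminants.
share = {g: percent(n, bacterial_contaminants, 1) for g, n in genus_counts.items()}

# Prevalence of each genus across all specimens in the project.
prevalence = {g: percent(n, total_specimens, 2) for g, n in genus_counts.items()}

# Non-target sequences that are neither Rickettsia nor Wolbachia.
remaining = bacterial_contaminants - sum(genus_counts.values())

print(share, prevalence, remaining)
```

Because additional contaminants were found in the representative subset, these prevalence values are best read as lower bounds.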
BOLD Rickettsia contaminants were dominated by amplicons from the Torix group of Rickettsia (716/753; 95.1%) (Figure 3). Of the remaining 37 Rickettsia, 25 clustered with the Transitional/Spotted Fever (n=15), Belli (n=9) and Rhyzobius (n=1) groups, while 12 sequences formed two unique clades (Table S4). Across arthropod hosts, 292 (38.8%) were derived from Hymenoptera; 189 (25.1%) from Diptera; 177 (23.5%) from Hemiptera; 41 (5.4%) from Psocoptera; 40 (5.3%) from Coleoptera; 7 (0.9%) from Arachnida; 4 (0.5%) from Trichoptera; and single cases from Thysanoptera, Diplopoda and Dermaptera (0.1% each). Mapping the 753 Rickettsia to collection site (Figure S1) revealed that infected arthropods were predominantly from Canada, with other locations in South/Central America, Europe, Africa and Asia.
We observed that two sets of COI primers were responsible for 99% of Rickettsia amplifications (Table S5), with a majority (89%) amplifying with the primer combination C_LepFolF/C_LepFolR (Hernández-Triana et al., 2014). Torix Rickettsia COI showed a stronger match to these primers at the 3’ end (the site responsible for efficient primer annealing) compared to Wolbachia and other Rickettsia groups. Whilst all contained a SNP at the 3’ priming end of C_LepFolR, Torix Rickettsia (Rickettsia endosymbiont of Culicoides newsteadi; MWZE00000000) was the only sequence lacking a similar SNP at the 3’ priming site of C_LepFolF (Tables S6.1 and S6.2).
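To illustrate what a 3’ end comparison of this kind involves, the sketch below counts mismatches between a primer and its binding site within a window at the 3’ end. It is a simplified illustration, not the procedure used to produce Tables S6.1 and S6.2: the sequences are invented toy strings (not C_LepFolF, C_LepFolR or any real COI site), the function name and the five-base window are assumptions, and degenerate (IUPAC) bases are not handled.

```python
def three_prime_mismatches(primer: str, site: str, window: int = 5) -> int:
    """Count mismatches in the last `window` bases (the 3' end) of a primer.

    Both strings are assumed to be written 5'->3' in the same orientation
    and to have equal length; degenerate bases are treated as literals.
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    tail = zip(primer[-window:].upper(), site[-window:].upper())
    return sum(1 for p, s in tail if p != s)


# Hypothetical toy sequences, for illustration only.
toy_primer = "ACGTTGCAAGCT"
site_a = "ACGATGCAAGCT"  # one mismatch, but outside the 3' window
site_b = "ACGTTGCAAGTT"  # one mismatch inside the 3' window

print(three_prime_mismatches(toy_primer, site_a))
print(three_prime_mismatches(toy_primer, site_b))
```

In this toy case both sites differ from the primer at a single position, but only the second difference falls at the 3’ end, where it would be expected to interfere most with annealing and extension.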