Assessment of non-target microbe amplicons
Contaminant pool sequences were then compared against a variety of potential contaminants, including Wolbachia and human homologues. Bacterial identity was refined by phylogenetic placement. To this end, barcodes confirmed as microbial sequences were aligned using the “L-INS-i” algorithm in MAFFT v7.4 (Katoh & Standley, 2013), and Gblocks (Castresana, 2000) was used to exclude regions of the alignment with excessive gaps or poor alignment. ModelFinder (Kalyaanamoorthy, Minh, Wong, von Haeseler, & Jermiin, 2017), run with default “auto” parameters, selected TIM3+F+I+G4 as the best-fitting substitution model according to the Bayesian information criterion. A maximum likelihood (ML) phylogeny was then estimated with IQTree (Nguyen, Schmidt, von Haeseler, & Minh, 2015) using an alignment of 561 nucleotides and 1000 ultrafast bootstraps (Hoang, Chernomor, von Haeseler, Minh, & Vinh, 2017). The Rickettsiales genera Anaplasma, Neorickettsia, Rickettsia and Wolbachia (Supergroups A, B, E, F, H) were included in the analysis as references. Finally, phylogenetic trees were drawn and annotated based on host taxa (order) using the EvolView (He et al., 2016) online tree annotation and visualisation tools.
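For readers who wish to reproduce a comparable workflow, the alignment, trimming and tree-estimation steps described above can be sketched as a sequence of command lines. This is a minimal illustration only: the file names, the helper `build_pipeline`, and the exact flags are assumptions, not the commands actually run in this study. The IQ-TREE options follow the 1.x command-line syntax, and the Gblocks call uses its default DNA settings, which may differ from those applied here.

```python
import shlex


def build_pipeline(fasta, model="TIM3+F+I+G4", bootstraps=1000):
    """Return (argv, stdout_path) pairs for an align/trim/tree workflow.

    Nothing is executed; the commands are only assembled so they can be
    inspected or passed to subprocess.run by the caller.
    """
    aligned = fasta + ".aln"       # hypothetical output name for the alignment
    trimmed = aligned + "-gb"      # Gblocks appends "-gb" to its input file name
    return [
        # MAFFT L-INS-i; MAFFT writes the alignment to stdout
        (["mafft", "--localpair", "--maxiterate", "1000", fasta], aligned),
        # Gblocks on a DNA alignment (-t=d), default block parameters (assumed)
        (["Gblocks", aligned, "-t=d"], None),
        # IQ-TREE with a fixed model and ultrafast bootstrap replicates;
        # "-m MFP" instead of a model name would run ModelFinder first
        (["iqtree", "-s", trimmed, "-m", model, "-bb", str(bootstraps)], None),
    ]


if __name__ == "__main__":
    for argv, stdout_path in build_pipeline("microbial_barcodes.fasta"):
        line = shlex.join(argv)
        if stdout_path is not None:
            line += " > " + shlex.quote(stdout_path)
        print(line)
```

Running the script prints the three commands in order, so they can be checked before any of the external programs are called.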
A determining factor for non-target amplification of bacteria is primer site matching to microbial associates. Therefore, the primer set predominantly used for BOLD barcode screening was assessed for pairwise homology with Rickettsia and Wolbachia COI genes.
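One simple way to quantify primer site matching is to slide a primer along a target sequence and record the window with the fewest mismatches, treating degenerate IUPAC bases as matches when they share a nucleotide. The sketch below illustrates that idea; it is not the procedure used in this study, the function names are hypothetical, and the sequences in the usage note are toy examples rather than real primers or COI genes.

```python
# IUPAC nucleotide codes mapped to the bases they can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def bases_match(primer_base, template_base):
    """True if two (possibly degenerate) bases share at least one nucleotide."""
    return bool(set(IUPAC[primer_base]) & set(IUPAC[template_base]))


def best_primer_site(primer, template):
    """Ungapped scan of the primer along the template.

    Returns (start, mismatches, identity) for the first window with the
    fewest mismatches. Only the given strand is scanned, and characters
    outside the IUPAC table (e.g. alignment gaps) raise KeyError.
    """
    primer, template = primer.upper(), template.upper()
    n = len(primer)
    if n == 0 or len(template) < n:
        raise ValueError("template must be at least as long as a non-empty primer")
    best_start, best_mismatches = 0, n + 1
    for start in range(len(template) - n + 1):
        window = template[start:start + n]
        mismatches = sum(not bases_match(p, t) for p, t in zip(primer, window))
        if mismatches < best_mismatches:
            best_start, best_mismatches = start, mismatches
    return best_start, best_mismatches, 1 - best_mismatches / n
```

For example, `best_primer_site("ACGTRC", "TTACGTACGG")` returns `(2, 0, 1.0)`, because the degenerate base R matches the A at that position. A reverse primer would need to be reverse-complemented before scanning the same strand.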