Metagenomic Analysis of Microbes
To better understand the microbiome associated with N. riversi ,
we first examined the taxonomic assignment for the most abundant
microbial reads using Autometa (Miller et al. 2019). To
generate an input coverage table for Autometa, we used
Minimap2 to calculate coverage for the long-read data against
the CANU assembly. Then, we ran the Autometa pipeline using a
contig length cutoff of 500 bp. While we ran this program with all reads
(beetle and non-targeted reads), Autometa separates eukaryotic
reads from bacterial reads during the clustering step. We then used
taxonomic identification and clustering to identify four unique bins
(Spiroplasma, Enterobacter, Acinetobacter, Rhodococcus ). Using
BUSCO v4.14 and CheckM (Parks et al. 2015), we assessed
the quality of the four bins. Specifically, we ran BUSCO in genome mode
using the auto lineage selector (–auto-lineage-prok) and we used the
lineage_wf CheckM command. From this analysis we determined
that the contigs matching to Spiroplasma had high enough quality
to continue with downstream functional analyses. The Spiroplasmasp. NR genome assembly has been made publicly available on NCBI (WGS
Accession: CP065192-CP065193).
To understand the function of Spiroplasma in its host, we
examined its phylogenetic placement relative to other Spiroplasmataxa, annotated its genome for functional elements, and also examined
whether it was transcriptionally active at different life-stages in the
RNAseq data. Previous studies have shown that Spiroplasma can
play a role in host defense against nematodes and parasitoids through
ribosome-inactivating proteins (RIPs, Ballinger & Perlman 2019), or
conversely act as a reproductive parasite when the male-killingspaid gene is present (Harumoto & Lemaitre 2018). First, the
phylogenetic relationship of Spiroplasma sp. NR was assessed
using the 16S rRNA gene. A total of 38 Spiroplasma accessions
were obtained from GenBank, and 1,286 bp of the 16S gene were
aligned with MUSCLE v3.8.31 (Edgar 2004). An evolutionary tree was
inferred using Maximum Likelihood in RAxML (Stamatakis 2014),
with the GTRCAT substitution model and 1,000 bootstrap replicates to
estimate support values. Next, we annotated the Spiroplasmagenome using prokka (Seemann 2014), with settings for the
Bacteria and genetic code 4. To further functionally annotate the
genome, we performed several annotations with 1) the KEGG
Automatic Annotation Server, 2) antiSMASH v5.0 (Blin et
al. 2019), 3) CRISPRfinder (Grissa et al. 2007), and 4)
DIAMOND blastx against custom databases and the UniProt
SwissProt and TrEMBL databases (accessed September 8, 2020;
O’Donovan et al. 2002).We kept only the top 5% of hits (–top 5)
that had an E value below 1e-05 for each query sequence. We searched for
UniProt accession numbers corresponding to 156 RIPs and the
Ankyrin-rich Spaid protein from S. poulsonii (Accession
A0A2R6Y5P0). Due to ambiguous results, we also used 1) blastn
and blastx against the spaid nucleotide and peptide
sequences on GenBank (Accessions MG837001.1 and AWJ64280.1), as
well as 2) metabolisHMM (McDaniel et al. 2019) searched
using the Ankyrin repeat (PF00023) and OTU-like (PF02338) Pfam
hmmer models. The resulting hits were compared to NCBI’s non-redundant
nucleotide database using blastx. Finally, to examine ifSpiroplasma was transcriptionally active at different life-stages
and in different sexes, we used the Kaiju webserver (Menzelet al. 2016) to assign taxonomic classifications to microbial
reads, using default parameters. We recorded the number of reads mapping
to domain Bacteria, class Mollicutes, and genus Spiroplasma . In
addition, we used kallisto v0.43.1 (Bray et al. 2016) to
pseudo-align all non-beetle reads against the Spiroplasma genome
from this study.