2.3 | Repeat structure prediction
Conserved features of the repeat sequences were identified with the
WebLogo web server (https://weblogo.berkeley.edu/logo.cgi). The RNA
secondary structures of the repeat sequences were predicted with the
RNAfold web server, and the minimum free energy (MFE) of each predicted
structure was calculated.
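As an illustration of what a sequence logo summarizes, the per-position information content of a set of aligned repeats can be sketched as follows. This is a minimal sketch, not the WebLogo implementation: the function name `information_content` and the example sequences are hypothetical, the repeats are assumed to be already aligned and of equal length, and the small-sample correction that WebLogo applies is omitted.

```python
import math
from collections import Counter


def information_content(aligned_repeats, alphabet="ACGT"):
    """Return the information content (bits) at each alignment position.

    Assumes equal-length, pre-aligned sequences. Characters outside
    `alphabet` (e.g. gaps) are ignored when computing frequencies.
    """
    length = len(aligned_repeats[0])
    if any(len(seq) != length for seq in aligned_repeats):
        raise ValueError("sequences must be aligned to the same length")

    max_bits = math.log2(len(alphabet))  # 2 bits for a 4-letter alphabet
    bits = []
    for column in zip(*aligned_repeats):
        counts = Counter(column)
        total = sum(counts[base] for base in alphabet)
        if total == 0:
            # Column contains no alphabet characters: no information.
            bits.append(0.0)
            continue
        entropy = 0.0
        for base in alphabet:
            if counts[base]:
                p = counts[base] / total
                entropy -= p * math.log2(p)
        bits.append(max_bits - entropy)
    return bits


# Hypothetical toy alignment: the first position is fully conserved,
# the second is split evenly between two bases.
toy_repeats = ["GA", "GA", "GC", "GC"]
toy_bits = information_content(toy_repeats)
```

Fully conserved positions reach the maximum of 2 bits for DNA, and positions where all four bases are equally frequent drop to 0 bits.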
2.4 | MLST and phylogenetic analyses
In silico MLST was performed with MLST 2.0, available on the CGE
website (https://cge.cbs.dtu.dk/services/MLST/), using the seven
housekeeping genes (leuS, pgi, pgk, phoE, pyrG, rpoB, and fusA) as
queries [32]. Multiple sequence alignment was performed with MUSCLE
v3.8.31 [33], and a phylogenetic tree was constructed in MEGA v7.0
using the neighbor-joining method. The tree was visualized with iTOL v6
(https://itol.embl.de).
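The principle behind sequence type (ST) assignment can be sketched as a lookup of the seven-locus allele profile in a profile table. This is a hedged illustration only, not the MLST 2.0 implementation: the function `assign_st`, the table `example_profiles`, and every allele and ST number below are hypothetical placeholders, not real K. variicola MLST data.

```python
# Locus order taken from the text above.
LOCI = ("leuS", "pgi", "pgk", "phoE", "pyrG", "rpoB", "fusA")


def assign_st(allele_calls, profiles):
    """Return the ST matching a seven-locus allele profile, or None.

    allele_calls: dict mapping locus name -> allele number.
    profiles: dict mapping a 7-tuple of allele numbers (in LOCI order) -> ST.
    A return value of None stands for a profile absent from the table.
    """
    missing = [locus for locus in LOCI if locus not in allele_calls]
    if missing:
        raise KeyError(f"missing allele calls for: {', '.join(missing)}")
    key = tuple(allele_calls[locus] for locus in LOCI)
    return profiles.get(key)


# Hypothetical placeholder table; the numbers carry no biological meaning.
example_profiles = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 2, 1, 3, 1, 1, 2): 2,
}

known = assign_st(
    {"leuS": 1, "pgi": 2, "pgk": 1, "phoE": 3, "pyrG": 1, "rpoB": 1, "fusA": 2},
    example_profiles,
)
novel = assign_st(
    {"leuS": 9, "pgi": 2, "pgk": 1, "phoE": 3, "pyrG": 1, "rpoB": 1, "fusA": 2},
    example_profiles,
)
```

A profile that differs at even one locus from every table entry returns `None`, which in practice would correspond to an unassigned or novel ST.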
2.5 | Spacer analysis, protospacer target identification and protospacer adjacent motif (PAM) determination
The putative origins of the CRISPR spacers were identified with the
CRISPRTarget web server
(http://crispr.otago.ac.nz/CRISPRTarget/crispr_analysis.html). The 8-bp
nucleotide sequences upstream of the predicted protospacers were
extracted to predict the PAM with the WebLogo web server
(https://weblogo.berkeley.edu/logo.cgi). Hierarchical clustering of the
spacers was performed in a Python script using the “seaborn” module.
The network of K. variicola spacers and MGEs from other species was
visualized in Gephi (https://github.com/gephi/gephi), with the layout
generated by a combination of the Fruchterman-Reingold and Noverlap
algorithms. Two species were connected when they shared at least one
spacer-protospacer match.
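The extraction of the 8-bp upstream flank used for PAM prediction can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' script: the function names, the 0-based half-open coordinate convention, and the toy target sequence are hypothetical, and "upstream" is taken to mean the 5' side of the protospacer on the strand where the match was found.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_flank(target, start, end, strand="+", flank_len=8):
    """Return the flank_len bases 5' of a protospacer, or None if truncated.

    target: forward-strand sequence of the matched element.
    start, end: 0-based, half-open forward-strand coordinates of the match
        (an assumed convention for this sketch).
    strand: "+" if the protospacer lies on the forward strand, "-" otherwise.
    """
    if strand == "+":
        if start < flank_len:
            return None  # flank runs off the start of the sequence
        return target[start - flank_len:start]
    if strand == "-":
        if end + flank_len > len(target):
            return None  # flank runs off the end of the sequence
        # On the reverse strand, the 5' flank lies after `end` on the
        # forward strand and is read as its reverse complement.
        return reverse_complement(target[end:end + flank_len])
    raise ValueError("strand must be '+' or '-'")


# Hypothetical toy target sequence (16 bp).
toy_target = "AAAACCCCGGGGTTTT"
plus_flank = upstream_flank(toy_target, 8, 12, strand="+")
minus_flank = upstream_flank(toy_target, 4, 8, strand="-")
```

The collected flanks, one per protospacer, would then be submitted as an aligned set to a logo tool. Matches too close to the sequence end return `None` and would be excluded.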