Possible evolutionary development of the family of bacterial 30S
ribosomal S1 proteins
The nature of protein repeats, the function of each repeat, and their
evolution remain unclear. These repeats evolved from a common ancestor, which necessarily
contained a single repeat 50. Some authors suggested
that the common ancestor of the family was indeed a single repeat that
formed homo-oligomers for effective functional activity 51. The homo-oligomeric structure of an ancestor may
reflect the intrachain repeating structure of its modern homologue, with
the exception of its multi-chain character. However, there are examples
of homologous multiple repeats, which are formed both from oligomers
with single repeats and from one chain of several repeats (Andrade et
al., 2001).
For the investigated bacterial proteins, the maximum number of repeats
of the S1 domain (six) is sufficient to perform all the necessary
functions. The third domain in this group has the highest identity
(68%) among all the domains. In addition, this domain has the highest
identity with the S1 domain from PNPase (E. coli) and the S1
domains from S1 single-domain proteins (Tenericutes, Mollicutes) 9. Its RNA binding site is formed by five
residues: F19, L22, H34, N64, and R68, which once again confirms the
uniqueness of this repeat and allows us to consider it the strongest
RNA binding site. Thus, the central part of the proteins (third and fourth
domains) appears to be vital for the activity and functionality of these
proteins. This suggestion is consistent with experimental data. One of
the well-studied proteins with six repeats of the S1 domain is the
bacterial 30S ribosomal protein S1 from E. coli. It was shown
that removing one S1 domain from the C-terminus or two S1 domains
from the N-terminus of the protein only reduces the efficiency of the
protein's functions but does not abolish them 14,41.
As mentioned above, Proteobacteria account for 55% of all S1 proteins
(Figure 1b). Within this phylum, the phylogenetic classes are represented by
different numbers of sequences and structural S1 domains (Figure 4).
Thus, Acidithiobacillia and Epsilonproteobacteria have six S1 domains,
whereas Alpha- and Deltaproteobacteria have five or six S1 domains. Note
that Epsilonproteobacteria is considered to be the oldest class in this
phylum 29,52. The Oligoflexia class is characterized
by the presence of four or six S1 domains; for Beta- and
Gammaproteobacteria, the number of S1 domains ranges from one to six.
Betaproteobacteria are evolutionarily most closely related to
Gammaproteobacteria and Acidithiobacillia, and together they make up a
taxon called Chromatibacteria 53. However, the
Acidithiobacillales class was previously classified as part of the
Gammaproteobacteria 54. Our data also support its
separation into a distinct class, based on its constant number of
structural S1 domains (Figure 4). Phylogenetic analyses of various
proteins suggest that Betaproteobacteria and Gammaproteobacteria
branched out later than most other phyla of Bacteria along with
Proteobacteria 55,56.
Alphaproteobacteria branched out at the same time as Deltaproteobacteria 55,56. Note that these two classes have five or six
domains, whereas Betaproteobacteria and Gammaproteobacteria have
variable numbers of S1 domains. According to our data, these classes
within Proteobacteria (in addition to the Actinobacteria, Bacteroidetes,
and Firmicutes phyla) have the greatest diversity in the number of S1
domains in comparison with other phyla, where this number is constant or
varies only slightly. The specific relationship of the phylum
Aquificae to the Epsilonproteobacteria is supported by the conserved
indel signature in inorganic pyrophosphatase, which is uniquely found in
the species of these two groups 57. The authors of 58 also suggested that Aquificae are
closely related to Proteobacteria. This closeness is attributed to frequent
horizontal gene transfer resulting from shared ecological niches. According to
our data, bacteria from the phylum Aquificae and class
Epsilonproteobacteria have exactly six S1 domains. The evolutionary
development of representatives of the Acidobacteria phylum is often
considered to be associated with Alphaproteobacteria 59,60 because bacteria of both
groups have been associated with a copiotrophic lifestyle 61. According to our data, the phylum Acidobacteria and
the class Alphaproteobacteria have six S1 domains. The evolutionarily
independent development of such phyla as Caldiserica, Deferribacteres,
Fusobacteria, Spirochaetes, Nitrospirae, and Nitrospinae/Tectomicrobia is
apparently reflected in the constant number of structural S1 domains in
these bacteria. Moreover, the phylum Spirochaetes is considered in the
literature to be a phylogenetically ancient and distinct group of
microorganisms 62. Bacteria of this phylum contain six S1 domains
(Figure 4).
As mentioned above, the analysis of 16S rRNA and characteristic
conserved indels in some proteins is used to group the phyla
Planctomycetes, Verrucomicrobia, and Chlamydiae into the PVC clan 28. Bacteria of the Chlamydiae and Verrucomicrobia
phyla generally contain six S1 domains, while Planctomycetes can have
four, five, or six S1 domains (Figure 4). According to some published
data, the genome of organisms of the phylum Planctomycetes is the
largest among the phyla of the PVC superphylum and the most
susceptible to evolutionary changes 63. The phyla
Chlamydiae and Verrucomicrobia are considered evolutionarily closer to
each other 64.
The FCB group is a superphylum of bacteria named after the main member
phyla Fibrobacteres, Chlorobi, and Bacteroidetes. Some authors also
include the phyla Gemmatimonadetes and Ignavibacteriae in this group 27. It should be noted that these phyla are
often placed at the same level on phylogenetic trees, while the phylum
Fibrobacteres is considered a phylogenetically more ancient group. Our
data show that the ribosomal S1 protein in this group almost always
contains six S1 domains (a constant number for the phyla Gemmatimonadetes,
Ignavibacteriae, Fibrobacteres, and Chlorobi and for the class Bacteroidia).
Within the phylum Bacteroidetes, the class Cytophagia has one, four, or
six domains (Figure 4).
The phylum Bacteroidetes, along with Proteobacteria, Firmicutes, and
Actinobacteria, is also among the most common bacterial groups in the
rhizosphere 65. They have been found in soil samples
from various locations, including cultivated fields, greenhouse soils,
and unexploited areas 66. Note that for these phyla,
the number of structural S1 domains can vary from one to six (Figure 4).
Terrabacteria are a supergroup containing the Actinobacteria,
Tenericutes, and Firmicutes phyla, as well as the Cyanobacteria,
Chloroflexi, and Deinococcus-Thermus phyla 29,52. It
is widely accepted that oxygenic photosynthesis developed in ancient
lineages of Cyanobacteria 67, but very little is
known about the nature and evolutionary history of anoxygenic
phototrophy, and much of the current understanding rests on assumptions and
hypotheses derived from the few extant bacterial taxa in which this
metabolism occurs. However, a number of studies have argued that one of
the earliest forms of anoxygenic photosynthesis arose in the Chloroflexi
phylum before the emergence of oxygenic photosynthesis during the
Archean Eon 68,69. Our data revealed three S1 domains
in the phylum Cyanobacteria and four S1 domains in the phylum
Chloroflexi. According to another view, the phyla Actinobacteria and
Chloroflexi are evolutionarily closer to each other 32. Note
that Actinobacteria predominantly have four S1 domains. The phylum
Firmicutes, which is evolutionarily close to the phyla Actinobacteria,
Cyanobacteria, Chloroflexi, and Deinococcus-Thermus 70,71, also
predominantly has four S1 domains according to our data.
Meanwhile, according to 32,70, the phylum
Deinococcus-Thermus (five S1 domains) is more ancient than the other phyla
in the supergroup Terrabacteria.
Note that the bacterial 30S ribosomal S1 protein from the parasitic
bacteria Mollicutes (phylum Tenericutes) effectively performs the basic
functions of RNA binding 40. There is an assumption in
the literature that mycoplasmas (Mollicutes) are a regressive branch of
the evolution of some Gram-positive bacteria or Firmicutes 72. This hypothesis was confirmed experimentally and
is considered in two possible variants: all mycoplasmas originate either
from a common ancestor with Gram-positive bacteria, or from different
bacteria 72. Based on a comparison of the 16S rRNA
oligonucleotide sequences of several species of mycoplasmas and
Gram-positive bacteria from the genera Clostridium, Bacillus,
Lactobacillus, and Streptococcus, a reasonable assumption was made about
their evolutionary relationship with the phylum Firmicutes 73,74. A more detailed analysis of 16S rRNA sequences
showed that mycoplasmas are phylogenetically closest to clostridia 75. In turn, the most likely ancestors of clostridia
(Firmicutes) are Gram-positive bacteria with a low G+C content in their
DNA. According to our data, the 30S ribosomal S1 protein from the phylum
Tenericutes has one S1 domain.
Summarizing the above, it can be argued that, first, the number of
structural S1 domains in bacteria of different phyla may coincide during
symbiotic life and, second, phylogenetically more ancient divisions have a
greater number of structural domains (mostly six). Moreover, the
earlier the microorganism in phylogenetic terms, the greater the
likelihood of a decrease and variation in its number of structural S1
domains.
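
The per-taxon domain counts quoted in this section can be collected in a small lookup table, which makes it easier to see which groups have a constant number of S1 domains and which vary. The following is only an illustrative sketch: the counts are transcribed from the text above, the names `S1_DOMAIN_COUNTS` and `split_by_variability` are hypothetical and not part of the original analysis, and taxa described only as a range or as "predominantly" having a given number (as well as Alphaproteobacteria, for which two different descriptions are given) are left out.

```python
# Illustrative only: S1 domain counts per taxon as stated in the text.
# The table and helper names are hypothetical, not the authors' code.
S1_DOMAIN_COUNTS = {
    "Acidithiobacillia": {6},
    "Epsilonproteobacteria": {6},
    "Deltaproteobacteria": {5, 6},
    "Oligoflexia": {4, 6},
    "Aquificae": {6},
    "Spirochaetes": {6},
    "Planctomycetes": {4, 5, 6},
    "Cytophagia": {1, 4, 6},
    "Cyanobacteria": {3},
    "Chloroflexi": {4},
    "Deinococcus-Thermus": {5},
    "Tenericutes": {1},
}


def split_by_variability(counts):
    """Return (constant, variable) taxon lists, each sorted by name.

    A taxon is 'constant' if exactly one domain count is reported for it.
    """
    constant = sorted(t for t, c in counts.items() if len(c) == 1)
    variable = sorted(t for t, c in counts.items() if len(c) > 1)
    return constant, variable


constant_taxa, variable_taxa = split_by_variability(S1_DOMAIN_COUNTS)

# Taxa for which the text reports exactly six S1 domains.
six_domain_taxa = sorted(t for t, c in S1_DOMAIN_COUNTS.items() if c == {6})

print("constant:", constant_taxa)
print("variable:", variable_taxa)
print("six domains:", six_domain_taxa)
```

Such a table only restates the counts given in the text; it does not replace the distribution shown in Figure 4.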