Motif mimicry
Out of the 5,569 pathogen proteins from 630 pathogens, 5,255 proteins
from 610 pathogens formed MLPs with the host interactor proteins, as
indicated in the schematic in Figure 1. However, only 239 unique motifs
were found to be mimicked by pathogens. Since each pathogen can mimic
motifs from multiple interactors, the largest number of MLPs was found
for the Polymerase basic protein 2 from Influenza A virus strain
A/Wilson-Smith/1933 H1N1 (A/Wilson-Smith/33/H1N1), with 35,385
MLPs. The average number of MLPs per protein was 732.
Amongst viral pathogens, A/Wilson-Smith/33/H1N1 had the most MLPs,
whereas in the bacterial interactome, Yersinia pestis had the most
MLPs. The top 10 pathogens by MLP count are listed in Table 4.
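The per-protein and per-pathogen counts reported above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each MLP is stored as a (pathogen, pathogen protein, host interactor, motif) record. The record layout, the toy identifiers, and the function name `summarize_mlps` are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

# Hypothetical MLP records: (pathogen, pathogen_protein, host_interactor, motif).
# These toy values are placeholders, not data from the study.
mlps = [
    ("pathogen_A", "protein_1", "host_X", "motif_1"),
    ("pathogen_A", "protein_1", "host_Y", "motif_1"),
    ("pathogen_A", "protein_2", "host_X", "motif_2"),
    ("pathogen_B", "protein_3", "host_Z", "motif_1"),
]


def summarize_mlps(records):
    """Count MLPs per protein and per pathogen, and collect unique motifs."""
    per_protein = Counter((pathogen, protein) for pathogen, protein, _, _ in records)
    per_pathogen = Counter(pathogen for pathogen, _, _, _ in records)
    unique_motifs = {motif for _, _, _, motif in records}
    # Mean is taken over proteins that have at least one MLP.
    mean_per_protein = sum(per_protein.values()) / len(per_protein)
    return per_protein, per_pathogen, unique_motifs, mean_per_protein


per_protein, per_pathogen, unique_motifs, mean_per_protein = summarize_mlps(mlps)

# Protein with the most MLPs, and pathogens ranked by MLP count (top 10).
top_protein = per_protein.most_common(1)[0]
top_pathogens = per_pathogen.most_common(10)
```

Because one protein can pair with many host interactors, the number of MLP records can greatly exceed the number of unique motifs, which is the pattern described above.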
Tables 2 and 4 show that S. cerevisiae S288c had the highest counts
of DLPs and MLPs even though the total number of reported HP-PPIs
was very low in comparison with viruses or bacteria. This can be
attributed to the fact that yeasts, being eukaryotes, are quite similar
to humans in terms of genes and cellular pathways. Genes that regulate
cellular processes in humans have equivalents that control cell
division in yeasts, which may make it easy for pathogenic yeast species
to alter the host cellular machinery (63). This study therefore
identifies potential mimicry candidates in fungal pathogens, which had
not been well established until now.
The total counts for the 10 most frequently occurring motifs in the
database are shown in Figure 4b. The figure shows the predominance of
the protein kinase C (PKC) and casein kinase II (CK2) phosphorylation
sites. The PKC and CK2 families of serine/threonine kinases play
essential roles in the hijacking of multiple human signalling pathways
during many viral infections (64). Tyrosine phosphorylation has been
shown to be an important process for pathogenesis as well as immune
responses following the discovery of a bacterial tyrosine phosphatase
(65). In several instances, both extracellular and intracellular
bacteria secreted proteins that mimicked the function of their
analogous eukaryotic-like proteins and hijacked the tyrosine
phosphorylation pathway (66). Additionally, N-myristoylation,
amidation, and N-glycosylation sites were seen in all organism
categories. Several studies have shown the contribution of
post-translational modification (PTM) sites to microbial infection and
cellular processes (67, 68).
The top 10 most frequent motifs in every pathogen category are listed in
Table 5. N-glycosylation was a frequently occurring motif and is known
to be an important modification used by several pathogen proteins
(specifically viral glycoproteins) to evade the human immune system (69,
70). The envelope proteins of viruses like HIV-1 are heavily
glycosylated and can provide camouflage against human proteins,
leading to altered immune recognition (71, 72). Protein
N-myristoylation is another conserved PTM of proteins involved in a
variety of physiological processes such as cell proliferation and
differentiation, cell survival, and cell death (73). Several
myristoylated proteins also have prominent roles in cellular signalling
pathways (74), and the myristoylation motif has been found to be
mimicked by viral and bacterial proteins (25, 75).
Several other commonly mimicked motifs in our data were the ABC
transporters family signature motif, Q motif, ATP/GTP-binding site
motif A (P-loop), arginine-rich motif, ubiquitination site and prenyl
group binding site. The ABC transporters family signature motif is a
conserved sequence (LSGGQ) present in the nucleotide-binding domain
(NBD) of all ABC transporters and is primarily required for substrate
transport (76). Pathogens may mimic this motif to disturb the
transport pathways of the host. The Q motif is part of conserved
helicases (involved in DNA dynamics) (77) and might help pathogens
to hijack the host machinery associated with DNA replication,
recombination, transcription, and repair. A highlight table showing
the number of MLPs for the top 20 mimicked motifs across the top
20 pathogens is provided as Supplementary Table S5.
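As an illustration of how short linear motifs such as those discussed above can be located in a sequence, the following sketch scans a pathogen and a host sequence with regular expressions. The LSGGQ core is taken from the text. The other three patterns are the commonly cited PROSITE-style consensus patterns, included here as assumptions, and may differ from the definitions used in this study. The toy sequences and the function names `find_motifs` and `shared_motifs` are hypothetical.

```python
import re

# Assumed motif definitions written as regular expressions.
# Only the LSGGQ core comes from the text; the rest are illustrative
# PROSITE-style consensus patterns, not the study's own definitions.
MOTIFS = {
    "ABC transporter signature (core LSGGQ)": r"LSGGQ",
    "PKC phosphorylation site": r"[ST].[RK]",
    "CK2 phosphorylation site": r"[ST].{2}[DE]",
    "N-glycosylation site": r"N[^P][ST][^P]",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif name: [(1-based start, matched text), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead with a capture group also reports overlapping matches.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]
    return hits


def shared_motifs(pathogen_seq, host_seq, motifs=MOTIFS):
    """Names of motifs that occur at least once in both sequences."""
    pathogen_hits = find_motifs(pathogen_seq, motifs)
    host_hits = find_motifs(host_seq, motifs)
    return sorted(name for name in motifs if pathogen_hits[name] and host_hits[name])


# Toy sequences, not real proteins.
pathogen_seq = "MKLSGGQRA"
host_seq = "AALSGGQKNGTA"

pathogen_hits = find_motifs(pathogen_seq)
host_hits = find_motifs(host_seq)
common = shared_motifs(pathogen_seq, host_seq)
```

A motif shared between a pathogen protein and its host interactor is only a candidate for mimicry. Short patterns like these match frequently by chance, so a shared hit does not by itself establish a functional role.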