Figure Legends
Figure 1 a. Schematic of DLP: A host interactor protein (red) and a pathogen protein (orange) that interact with the same host protein (blue) share a similar domain X (yellow). In this way, the pathogen protein can mimic the domain of the host interactor protein and compete with it to bind to the host protein, thus causing disease. b. Schematic of MLP: A host interactor protein (red) and a pathogen protein (yellow) that interact with the same host protein (blue) share a similar motif X (orange). In this way, the pathogen protein can mimic the motif of the host interactor protein and compete with it to bind to the host protein, thus causing disease.
Figure 2. Database schematic: The basic pipeline for the search options is shown on the left; the basic workflow and the count of entities in the database are shown on the right.
Figure 3. Global and local imitation of host proteins by pathogens: Graph depicting the total interactions and the interactions characterized by mimicked domains and motifs for different categories of pathogens.
Figure 4 a. Frequently occurring domains: Bar graph showing the frequency of the top 10 domains mimicked by pathogens in the database. The description of the domains is as follows: PHA03247- large tegument protein UL36; Smc- Chromosome segregation ATPase; SMC_prok_B- chromosome segregation protein SMC, common bacterial type; PKc- Catalytic domain of Protein Kinases; STKc_PknB_like- Catalytic domain of bacterial Serine/Threonine kinases, PknB and similar proteins; STKc_CMGC- Catalytic domain of CMGC family Serine/Threonine Kinases; STKc_CAMK- Catalytic domain of CAMK family Serine/Threonine Kinases; STKc_AMPK-like- Catalytic domain of AMP-activated protein kinase-like Serine/Threonine Kinases; STKc_PDK1- Catalytic domain of the Serine/Threonine Kinase, Phosphoinositide-dependent kinase 1; STKc_MLCK-like- Catalytic kinase domain of Myosin Light Chain Kinase-like Serine/Threonine Kinases. b. Frequently occurring motifs: Bar graph showing the frequency of the top 10 motifs mimicked by pathogens in the database. The description of the motifs is as follows: PKC_PHOSPHO_SITE- Protein kinase C phosphorylation site; CK2_PHOSPHO_SITE- Casein kinase II phosphorylation site; MYRISTYL- N-myristoylation site; ASN_GLYCOSYLATION- N-glycosylation site; CAMP_PHOSPHO_SITE- cAMP- and cGMP-dependent protein kinase phosphorylation site; AMIDATION- Amidation site; TYR_PHOSPHO_SITE_1- Tyrosine kinase phosphorylation site 1; TYR_PHOSPHO_SITE_2- Tyrosine kinase phosphorylation site 2; RGD- Cell attachment sequence; PRO_RICH- Proline-rich region profile.
Figure 5. Enriched pathways in host proteins: Bar graph depicting the enriched pathways of host proteins in the database along with their enrichment percentages.
Figure 6 a. The ImitateDB web interface: Expanded view of the search panel of the web interface showing the steps to query the ImitateDB database. b. Receiving large result files by email: Expanded view of the mailer pop-up on the ImitateDB interface.