3.4 Availability of Hist-i-fy:
The source code for Hist-i-fy is available at
https://github.com/dibyansu24-maker/Histify and can be run on any
platform by following the instructions on GitHub. Hist-i-fy also allows
users to train the model on user-defined data.
The Hist-i-fy prediction server is hosted at
https://histify.streamlit.app/ using Streamlit, a Python library for
building interactive web applications. Through the Streamlit interface,
users can predict modifications in two modes: single sequence or
multiple sequences (uploaded as a CSV file containing the sequences and
histidine residue numbers). Streamlit integrates with GitHub, which
simplifies hosting. The Hist-i-fy model processes the input data and
generates predictions for the respective sequences.
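Before uploading a batch file, it can help to check that every listed
residue number actually points at a histidine. The following is a
minimal sketch only: the column names `sequence` and `position`, the
1-based residue numbering, and the helper name `check_histidine_rows`
are assumptions made for illustration and are not taken from the
Hist-i-fy documentation; the exact CSV layout expected by the server
should be checked against the GitHub instructions.

```python
import csv
import io


def check_histidine_rows(csv_text, seq_col="sequence", pos_col="position"):
    """Split batch-input rows into valid (sequence, position) pairs and errors.

    Assumptions (not the server's documented format): a header row with
    the given column names, and 1-based residue numbering.
    """
    valid, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 of the file is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        seq = (row.get(seq_col) or "").strip().upper()
        raw_pos = (row.get(pos_col) or "").strip()
        if not raw_pos.isdecimal():
            errors.append((line_no, "position is not a non-negative integer"))
            continue
        pos = int(raw_pos)
        if not 1 <= pos <= len(seq):
            errors.append((line_no, "position outside the sequence"))
        elif seq[pos - 1] != "H":
            errors.append((line_no, f"residue {pos} is {seq[pos - 1]}, not H"))
        else:
            valid.append((seq, pos))
    return valid, errors


if __name__ == "__main__":
    example = "sequence,position\nMKHLA,3\nMKHLA,2\nMKHLA,9\n"
    valid_rows, problems = check_histidine_rows(example)
    print("valid:", valid_rows)
    print("problems:", problems)
```

Only the rows returned in the first list would then be written to the
file that is uploaded in multiple-sequence mode.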
4. CONCLUSION:
The functional characterization of proteins and their amino acids lags
behind protein sequence determination. Experimental characterization is
time-consuming and laborious, and can be complemented by computational
methods. Because histidine is one of the most important amino acids at
enzyme active sites, its functional characterization could address many
biological problems. His undergoes multiple modifications; sixteen are
reported in the UniProt database. However, only a handful of prediction
tools are available, and each predicts a single His function. The
objective of this study was to predict multiple histidine modifications
(functions) from protein sequences. Here we trained and validated a
Convolutional Neural Network (CNN) model on eight histidine
modifications. The training dataset was curated from the UniProt
database. The overall accuracy of the CNN model was 75%, although the
prediction accuracy for individual modifications varies with the data
size, existing sequence patterns, etc. The external validity of the
model was tested on an independent phosphorylation dataset obtained from
a proteomics study. The accuracy of phosphorylation prediction in
external validation was 94.1%, much higher than the internal-validation
accuracy across all eight modifications. The external validation result
was comparable to that of an existing His phosphorylation prediction
tool, pHisPred. The final CNN model (termed
as Hist-i-fy) is publicly available as a web application and a
stand-alone program.