2.1.3 Glycosylation
Different amino acids may undergo different types of glycosylation,
namely C-linked, N-linked, O-linked, or S-linked [13]. Histidine
undergoes N-linked glycosylation.
2.1.4 Hydroxylation
Protein hydroxylation, a post-translational modification, is carried out
by 2-oxoglutarate-dependent dioxygenases. On proline, this modification
can be induced by hypoxia-inducible factor alpha (HIF-α) [14].
Hydroxylation may also involve protein-protein interactions and
downstream signalling. Apart from proline, the residues lysine,
asparagine, aspartate, and histidine can also undergo hydroxylation
[15].
2.1.5 Methylation
The actin and myosin proteins undergo histidine methylation as a
post-translational modification (PTM). Methylation can occur at two
different positions of the histidine, giving 1-methyl histidine (1MeH)
or 3-methyl histidine (3MeH) [16].
2.1.6 Oxidation
Under unusual or stressed conditions, histidine undergoes oxidation to
2-oxo-histidine (2-oxo-His) (Figure 1). Photo-induced oxidation of
histidine, leading to various cross-links involving intact His, Lys and
Cys, was observed in high-molecular-weight (HMW) fractions of monoclonal
antibodies [17].
Oxidation of His residues is also observed in proteins from cells
undergoing oxidative stress [18]. 2-oxo-His changes the dissociation
pattern of peptide ions in mass spectrometry studies [19].
2.1.7 Phosphorylation
His phosphorylation is a crucial step in various cellular processes,
such as signal transduction, cell cycle, proliferation, differentiation,
and apoptosis. Phosphorylated His accounts for 6% of all phosphorylated
amino acids. However, phosphorylation of His is less explored than
phosphorylation of serine, threonine and tyrosine. A consolidated
database on phosphorylated His (HisPhosSite) has recently become
available [20]. Histidine kinases (HKs) are among the classical
non-animal kingdom kinases that phosphorylate His, although in a
two-step manner: (i) the phosphate is transferred from ATP to His, and
(ii) the phosphate is then transferred to an aspartate residue [21].
2.1.8 Protein Splicing
Protein splicing is triggered via acid-base catalysis that involves
multiple conserved His residues at the active site. Histidine probably
plays a dual role in protein splicing, first as a general base to start
acyl shift splicing and then as a general acid to break the scissile
bond at the N-terminal splicing junction [22].
2.2 Sequence signatures around different His post-translational
modifications
Many of the His post-translational modifications were identified with
specific sequence signatures or motifs. For example, the His
hydroxylation motif is part of a hydrogen-bond (H-bond) cluster that is
brought into register by a GXXG motif [23]. For His methylation, the
common motif observed in short methylated peptides was GHXHXH [24].
The histidine acetylation motif deduced from mass spectrometry data on
diacetyl-fed rat lung proteins was GXPGXXGHXGXXG [25]. However, some
His post-translational modifications do not carry sequence signatures.
For example, no specific sequence motif is reported for His
glycosylation, and no clear sequence motif was identified for His
phosphorylation [26].
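The motifs quoted above can be searched for in a protein sequence with
a short script. The following is a minimal sketch, not the method used
in the cited studies: it assumes that X denotes any single residue
(matched here by any uppercase letter), and the function names
motif_to_regex and find_motif as well as the example sequence are
hypothetical.

```python
import re

# Motifs quoted in the text; X is assumed to mean "any single residue".
MOTIFS = {
    "hydroxylation": "GXXG",
    "methylation": "GHXHXH",
    "acetylation": "GXPGXXGHXGXXG",
}


def motif_to_regex(motif):
    """Compile a motif with X wildcards into a regex.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are all reported.
    """
    body = "".join("[A-Z]" if ch == "X" else re.escape(ch) for ch in motif)
    return re.compile(f"(?=({body}))")


def find_motif(sequence, motif):
    """Return the 1-based start positions of every motif occurrence."""
    regex = motif_to_regex(motif)
    return [m.start() + 1 for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "GAPGAAGHAGAAG"  # made-up sequence, not a real protein
    for name, motif in MOTIFS.items():
        print(name, motif, find_motif(toy_sequence, motif))
```

A motif hit only marks a candidate site; whether the histidine is
actually modified still has to come from experimental annotation.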
2.3 Training dataset generation for Histidine post-translational
modifications
There are eight His post-translational modifications (Figure 1)
annotated in this work based on the availability of protein sequences
from the UniProt database [27]. From the “Keyword” subsection of
UniProt, the category “PTM” was selected to track all possible
post-translational modifications. The case-insensitive text filters
“His” and “Histidine” were used to identify the experimentally
annotated His functions in the PTM category, as curated in November
2022. A total of sixteen modifications were identified, some of which
have very few data points. Finally, the eight modifications with at
least twenty data points each were selected for the training dataset
(Table 1).
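The selection step described above can be sketched in a few lines. This
is an illustration under stated assumptions rather than the pipeline
used in this work: the record layout (protein identifier, PTM label),
the function name select_modifications, and the toy labels are all
hypothetical, and the text filter is assumed to be a case-insensitive
substring match.

```python
from collections import Counter


def select_modifications(records, terms=("his", "histidine"), min_count=20):
    """Count data points per PTM label and keep the well-populated ones.

    records: iterable of (protein_id, ptm_label) pairs (assumed layout).
    A label is kept if it contains any of the terms, ignoring case,
    and occurs at least min_count times.
    """
    lowered = [term.lower() for term in terms]
    counts = Counter(
        label
        for _, label in records
        if any(term in label.lower() for term in lowered)
    )
    return {label: n for label, n in counts.items() if n >= min_count}


# Toy records for illustration only; these are not UniProt entries.
toy_records = (
    [(f"P{i:05d}", "Histidine modification A") for i in range(25)]
    + [(f"Q{i:05d}", "HIS modification B") for i in range(20)]
    + [(f"R{i:05d}", "Histidine modification C") for i in range(3)]
    + [(f"S{i:05d}", "Serine modification D") for i in range(40)]
)

selected = select_modifications(toy_records)
print(selected)
```

With the toy data, modification C is dropped for having fewer than
twenty data points and modification D is dropped by the text filter.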