Results and Discussion

The overall structure of human PPA1 is similar to other Family I PPases

The crystal structure of human PPA1 was determined at a resolution of 2.39 Å using the molecular replacement (MR) method. A monomeric structure of E-PPase was chosen as the search model to minimize bias in the structure determination. The sequence of human PPA1 is much longer than that of E-PPase (289 residues vs 176 residues). The two sequences are only marginally homologous (residues 44-194 in human PPA1 match residues 16-143 in E-PPase with 27% identity, 41% similarity, and 18% gaps). It is also known that prokaryotic and eukaryotic PPases assume different oligomerization states in their crystal structures. The fact that a monomeric E-PPase structure used as the MR search model readily led to a clear solution of the human PPA1 structure indicates that PPA1 and E-PPase share highly homologous core structures.
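The identity and gap percentages quoted for the PPA1/E-PPase alignment are column counts over a pairwise alignment. The Python sketch below illustrates such a count. The two aligned strings are invented placeholders (not the PPA1 or E-PPase sequences), and the choice of denominator (all alignment columns) is an assumption, since alignment programs differ on this convention.

```python
def alignment_stats(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent gap) over all alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_a)
    if columns == 0:
        raise ValueError("alignment is empty")
    identical = 0
    gapped = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            gapped += 1      # column contains a gap in either sequence
        elif a == b:
            identical += 1   # same residue in both sequences
    return 100.0 * identical / columns, 100.0 * gapped / columns


# Hypothetical toy alignment, for illustration only
identity, gaps = alignment_stats("MKV-LDE", "MRVALDQ")
```

Similarity would additionally require a substitution matrix to decide which non-identical pairs count as conservative, which is omitted here.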
The polypeptide chain of human PPA1 adopts a single-domain globular fold consisting of seventeen β-strands, four α-helices, and four 3₁₀-helices (Figures 1 and 2). The core of the globular fold is composed of a 5-stranded antiparallel β-barrel (strands 5, 11, 13, 14 and 15) and a 5-stranded antiparallel β-sheet (strands 1, 2, 3, 6 and 7) (Figure 2A). The β-barrel and β-sheet are packed closely together. Other parts of the molecule, including the four α-helices, four 3₁₀-helices, and seven short β-strands (strands 4, 8, 9, 10, 12, 16, and 17), surround the β-core structure. A parallel 2-stranded β-sheet formed by the two short strands β10 and β17 helps to anchor the C-terminus of the polypeptide to the protein fold.
Among the soluble PPases with known structures, human PPA1 shares the highest sequence homology with the PPases from S. japonicum and S. cerevisiae (about 70% similarity, see Figure 1). The structures of the PPases from these three species are similar. Superimposition of the monomeric human PPA1 structure onto the monomeric structures of the PPases from S. japonicum (PDB code 4QLZ) and S. cerevisiae (PDB code 2IHP) gave RMSDs of 0.98 Å (based on 3032 common atoms) and 0.97 Å (based on 2981 common atoms), respectively. Figure 2B shows a superimposition of the monomeric structures of human PPA1 and Y-PPase. The two structures superimpose well in most parts of the molecules, including the active site. Some differences exist in the α4-helix, β8, β9, β10, and β17 regions, as well as in some connecting loops. In comparison with the structure of E-PPase (PDB code 4UM4), the β-barrel, β6, β7, α1-helix, α2-helix, and the N-terminal portion of the α3-helix of human PPA1 have their counterparts in the E-PPase structure (RMSD = 2.60 Å over 640 common atoms). These portions of the structure are common to all known structures of soluble Family I PPases.
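For reference, the RMSD values reported here are root-mean-square distances over matched atom pairs. The Python sketch below computes this quantity for two coordinate sets that are assumed to be already superimposed and listed in the same atom order. It does not perform the optimal superposition itself, and the coordinates are toy values rather than data from the structures discussed.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between matched (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate lists are empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between one matched atom pair
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: two atoms, each displaced by 1 Å along z
example = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```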
Human PPA1 and three other eukaryotic PPases (Sj-PPase, Y-PPase, and Pf-PPase) have a C-terminal extension sequence (after the α3-helix) that is absent from other Family I PPases (Figure 1). This C-terminal extension assumes a similar structure in human PPA1, Sj-PPase, and Y-PPase (Figure 2B), and is involved in the homodimerization of these PPases (see below). The C-terminal extension in Pf-PPase assumes a different structure that is not involved in the homodimerization of Pf-PPase [48].

Human PPA1 forms a dimeric structure that is conserved in a subset of Family I PPases

In the crystal, human PPA1 exists as a homodimer. The two protomers in the homodimer assume a relative orientation that is analogous to identical twins standing arm in arm, facing opposite directions (Figures 2C and 2D). Similar homodimers are observed in the structures of the PPases from S. japonicum and S. cerevisiae (Sj-PPase and Y-PPase, PDB codes 4QLZ and 2IHP). Formation of the human PPA1 homodimer buries 1860 Å² of solvent accessible surface area (SASA) from the two protomers, comparable to the areas buried in the Sj-PPase and Y-PPase homodimers (1830 Å² and 2030 Å², respectively). The three crystals have different space groups (P2₁2₁2₁, P3₂, and P12₁1). The presence of similar homodimers in the structures of three PPases crystallized in different space groups, together with the large SASA buried upon dimerization, indicates that the homodimer is unlikely to be a crystal packing artifact.
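The buried surface area of a dimer is commonly obtained as the difference between the summed SASA of the isolated protomers and the SASA of the assembled dimer. The Python sketch below shows that arithmetic only. It assumes the total-over-both-protomers convention used in the text, and the input areas are round placeholder numbers, not values measured for any of the structures.

```python
def buried_sasa(sasa_protomer_a, sasa_protomer_b, sasa_dimer):
    """Total SASA (in Å²) buried from both protomers upon dimer formation."""
    buried = sasa_protomer_a + sasa_protomer_b - sasa_dimer
    if buried < 0:
        raise ValueError("dimer SASA exceeds the sum of the protomer SASAs")
    return buried


# Placeholder areas in Å² (hypothetical, for illustration only)
total_buried = buried_sasa(10000.0, 10000.0, 18000.0)
per_protomer = total_buried / 2.0  # some programs report this per-protomer value instead
```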
The homodimerization of human PPA1 is driven by multiple factors including shape complementarity of the dimerization interface, hydrophobic contacts, intermolecular hydrogen bonds, and electrostatic interactions. A large number of residues, including Arg52, Trp53, Asn83, Phe85, Pro86, Lys88, Ser127, Val129, Asp165, Lys179, Pro180, Gly181, Tyr182, Ala185, Asp281, Lys282, Trp283, His285, and His286, are involved in homodimerization (Figures 3 and 4). These residues are scattered in six different regions within the primary sequence, with the three regions around Phe85 (strand β10 and the following loop), around Pro180 (linker between the α1 and α2 helices and the N-terminus of the α2-helix), and around Trp283 (3₁₀-helix η4, strand β17, and the following loop) containing most of the residues (Figure 1). Notably, residues within the C-terminal extension play an important role in dimerization, not only by directly participating in the interfacial interactions but also by holding the β10 region in position.
A large portion of the interface molecular surface is defined by the sidechains of the hydrophobic residues and aliphatic parts of the sidechains of other residues (white to orange red surface areas in Figure 4). Hydrophobic contacts in the dimerization interface are extensive (Figure 3). Some examples include the contacts mediated by Phe85, Pro86, and Trp283 (Figure 4), which are conserved among the three eukaryotic PPases (human PPA1, Sj-PPase, and Y-PPase, see Figure 1).
There are several hydrogen bonds at the dimerization interface. Except for the one between the backbone oxygen of Asp281 and the sidechain of Arg52, all hydrogen bonds are mediated by structured water molecules (Figure 3). Two electrostatic interactions are observed, between the sidechains of the Asp165-Lys282 and Asp281-Arg52 pairs. The Asp281-Arg52 pair is conserved in human PPA1, Sj-PPase, and Y-PPase (Figure 1).
Soluble Family I PPases are known to exist in different oligomeric states. Prokaryotic PPases form hexamers under physiological conditions [37-39, 46, 47, 49]. All but one of the known eukaryotic PPase structures form dimers [40-45, 48]. The exception is TbbVSP1, which forms a tetramer (dimer of dimers) [50]. The crystal structure of human PPA1 reveals a dimerization mode that is conserved in Sj-PPase and Y-PPase. The C-terminal extensions in these PPases are critically involved in dimerization. Other eukaryotic PPases either do not have a C-terminal extension (Tg-PPase and TbbVSP1) or have a C-terminal extension in a different configuration (Pf-PPase). The dimerization modes in Tg-PPase, TbbVSP1, and Pf-PPase are different from each other, and different from the conserved mode among human PPA1, Sj-PPase, and Y-PPase [50]. The available crystal structures thus show that diverse modes of dimerization exist in eukaryotic soluble Family I PPases. Previously, it was proposed (based on sequence conservation) that soluble Family I PPases from animals and fungi might share a similar mode of dimerization [50]. The crystal structure of human PPA1 provides strong evidence in support of this proposal.
Although it is now generally believed that soluble Family I PPases exist in a multimeric state, the effects of multimerization and its structural diversity on PPase function, if any, are not known. In the case of human PPA1 (and the homologous Sj-PPase and Y-PPase), dimerization places the two active sites (one from each protomer) on opposite molecular surfaces of the dimer, far away and isolated from each other (Figures 2C and 2D). Catalytic reactions at the two active sites should therefore be independent of each other. Residues involved in dimerization are also far away from the active site in the monomeric structure, so dimerization should have only an allosteric effect, if any, on the active site. Of course, even if dimerization is not required for the phosphatase function of human PPA1 under physiological conditions, it may still be relevant to other PPA1 function(s) that are not currently known.
A striking feature of the human PPA1 dimeric structure is the presence of a large cleft at the dimerization interface (Figure 2D). The cleft is large enough to accommodate a 4-turn α-helix. Whether this cleft represents a functional site of human PPA1 is not known and deserves further study. From the perspective of structure-based drug development using human PPA1 as a target, the dimerization interface cleft may serve as a useful allosteric target site. In the case of Mt-PPase, inhibitors that bind at a non-conserved interface between monomers of the hexameric structure were identified, and these block the hydrolysis reaction in an uncompetitive and allosteric manner [36].

Human PPA1 has a largely pre-organized active site

Extensive studies have been carried out on the active site structures and catalytic mechanisms of Y-PPase [40-45]. This prior knowledge is critical for analyzing the active site of human PPA1 in the current structure.
The human PPA1 structure does not contain the substrate at the putative active site. To gain insight into how the putative active site residues (inferred from structure-based sequence alignment with Y-PPase) are oriented relative to the pyrophosphate substrate, the substrate was modeled into the human PPA1 structure by superimposing the structure on a substrate-bound, fluoride-inhibited Y-PPase structure (PDB code 1RE6A) [45]. The overall structures superimposed well, with a small RMSD of 0.98 Å (Figure 2B). The portions of the structures that define the active site show only very slight differences in backbone and sidechain conformations (Supplementary Figure S1). The superimposed coordinates of the pyrophosphate were merged with the PPA1 coordinates to generate the structure of PPA1 with a pyrophosphate at the active site.
Figure 5A shows the active site structure of human PPA1 with a modeled pyrophosphate substrate. The 14 human PPA1 active site residues (matching the established active site residues in Y-PPase) are Glu49, Lys57, Glu59, Arg79, Tyr94, Gly95, Asp116, Asp118, Asp121, Asp148, Asp153, Lys155, Tyr193, and Lys194 (Y-PPase residue numbers are one less than the human PPA1 residue numbers shown). Most of these residues are located in the β-barrel and the adjacent β6-strand, which are well conserved and well defined in all structures of soluble Family I PPases. Tyr193 and Lys194 lie at the transition from the 3₁₀-helix η2 to the α2-helix. The active site residues can be divided into two groups based on their spatial relationship to the pyrophosphate substrate. The first group consists of all of the negatively charged residues and Gly95. These residues cluster on one side of the active site, close to the P2 phosphorus atom. The second group includes all of the positively charged residues and the two tyrosines. These residues are largely located on the other side of the active site, close to the P1 phosphorus atom (Figure 5A). Studies on Y-PPase reveal that the active site carboxylates are responsible for coordinating four metal ions and activating a water molecule as the reaction nucleophile, while the positively charged sidechains stabilize the transition state and the leaving group [40-45].
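The numbering relationship stated above (Y-PPase residue numbers are one less than the human PPA1 numbers) can be written as a simple conversion. The Python sketch below applies that offset to the 14 listed residues. The helper name is hypothetical, and the sketch assumes the residue type is the same at each matched position, as implied by the statement that the residues match those in Y-PPase.

```python
# The 14 human PPA1 active site residues listed in the text
HUMAN_PPA1_ACTIVE_SITE = [
    "Glu49", "Lys57", "Glu59", "Arg79", "Tyr94", "Gly95", "Asp116",
    "Asp118", "Asp121", "Asp148", "Asp153", "Lys155", "Tyr193", "Lys194",
]


def to_yppase_label(human_label, offset=-1):
    """Convert a label such as 'Glu49' to Y-PPase numbering (hypothetical helper)."""
    name, number = human_label[:3], int(human_label[3:])
    return f"{name}{number + offset}"


# Map each human PPA1 residue to its assumed Y-PPase counterpart
yppase_equivalents = {res: to_yppase_label(res) for res in HUMAN_PPA1_ACTIVE_SITE}
```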
Because most of the active site residues are located in the rigid core of the structure, their positions (locations of the backbone atoms) should change little upon substrate binding and during the reaction. The conformations of the sidechains (especially the longer sidechains of lysine and arginine residues) could, however, change. Comparison of the apo-structure of human PPA1 and the substrate-bound, fluoride-inhibited Y-PPase structure shows that the active site residues superimpose well, with some conformational differences in the sidechains of Lys57, Arg79, and Lys194 (Supplementary Figure S1). Low B-factor values for residues at the active site also suggest relatively low flexibility of these residues (Figure 5B). These data and analyses indicate that the active site of human PPA1 is largely pre-organized, which would minimize the need for conformational reorganization during catalysis.

The active site of human PPA1 has the potential to accommodate double-phosphorylated peptides from JNK1

While it is well established that PPases catalyze the hydrolysis of pyrophosphate, several recent studies reveal that PPA1 may also function as a protein phosphatase [11, 22, 61]. It was found that human PPA1 could directly dephosphorylate phosphorylated JNK1 at both the phosphopeptide and phosphoprotein levels, while no catalytic activity towards pERK or p-p38 was detected [22]. PPA1 silencing significantly down-regulated colon cancer cell proliferation. This antiproliferative effect is impaired by a JNK inhibitor, indicating that the role of PPA1 in colon cancer is at least partially related to regulation of JNK activity. Evidence for the function of PPA1 as a pJNK phosphatase was also obtained in studies on neuronal differentiation in mouse, rat, and chick embryos [11, 61]. It was shown that PPA1 knockdown or overexpression led to increased or decreased JNK phosphorylation levels, respectively, while no alteration of the JNK phosphorylation level was detected after treatment with a catalytically inactive PPA1 mutant. PPA1 may therefore play a role in neuronal differentiation via JNK dephosphorylation. JNK is a member of the mitogen-activated protein (MAP) kinase family. JNK regulates the activity of numerous downstream molecules, including c-Jun, p53, and Bcl2, by phosphorylation. JNK is activated by dual phosphorylation of Thr183 and Tyr185 within a 180-FMMTPYVV motif. Dephosphorylation of JNK by protein phosphatases inactivates the enzyme [62].
So far, structural and mechanistic studies on soluble Family I PPases have concerned only inorganic pyrophosphate as the substrate. Given the emerging role of PPA1 as a JNK phosphatase, we carried out modeling studies to investigate whether the known pyrophosphate binding active site could also accommodate phosphopeptides.
Four structural models were constructed, each with a short phosphopeptide bound at the known pyrophosphate binding active site. The phosphorus atom of the phosphotyrosine or phosphothreonine residue within the phosphopeptide is located at the same position as the P2 phosphorus atom of a pyrophosphate substrate. The sequences of the four phosphopeptides match either JNK1 or the MAP kinase Erk2. A JNK1-derived pentapeptide 182-MTpPYpV was used to model the Yp at the active site (Figure 6A). A JNK1-derived hexapeptide 181-MMTpPYpV was used to model the Tp at the active site (Figure 6B). Two Erk2-derived tetrapeptides, 184-EYpVA and 181-FLTpE, were used to model the Yp and the Tp at the active site, respectively (Supplementary Figure S2).
All of the phosphopeptides can be accommodated in the active site pocket without steric clashes between the phosphopeptide and PPA1. The PPA1 structures with a bound phosphopeptide showed only very minor conformational changes (during energy minimization in Chimera) for residues on the outer portion of the active site pocket. Superimpositions of the phosphopeptide-bound PPA1 structures with the apo-structure give RMSD values of less than 0.2 Å. The (putative) active site of human PPA1 that would bind the pyrophosphate substrate is located at the inner and deep portion of a cavity (Figure 5B; the location of the active site is indicated by the black arrow and the cavity by the white oval). The outer portion of the cavity becomes much larger, expanding downward and leftward while getting shallower.
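A steric overlap check of the kind implied here can be sketched as a pairwise distance test between peptide and protein atoms. The Python example below flags atom pairs whose van der Waals spheres overlap by more than a tolerance. It is an illustrative sketch only, not the procedure used in the modeling: the 0.4 Å tolerance is an arbitrary assumption, the radii are Bondi values for a few elements, and the coordinates are toy data.

```python
import math

# Bondi van der Waals radii in Å for a few common elements
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "P": 1.80, "S": 1.80}


def find_clashes(peptide_atoms, protein_atoms, tolerance=0.4):
    """Return (i, j, overlap) for atom pairs overlapping by more than `tolerance` Å.

    Each atom is a tuple (element, x, y, z). The tolerance is an assumed,
    illustrative cutoff, not a value taken from the paper.
    """
    clashes = []
    for i, (elem_a, *xyz_a) in enumerate(peptide_atoms):
        for j, (elem_b, *xyz_b) in enumerate(protein_atoms):
            distance = math.dist(xyz_a, xyz_b)
            # overlap of the two van der Waals spheres along the interatomic axis
            overlap = VDW_RADII[elem_a] + VDW_RADII[elem_b] - distance
            if overlap > tolerance:
                clashes.append((i, j, overlap))
    return clashes


# Toy coordinates: one peptide oxygen, one nearby and one distant protein carbon
peptide = [("O", 0.0, 0.0, 0.0)]
protein = [("C", 2.0, 0.0, 0.0), ("C", 5.0, 0.0, 0.0)]
clashes = find_clashes(peptide, protein)
```

An empty result for every peptide-protein atom pair would correspond to a model free of clashes under the chosen cutoff.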
The pyrophosphate binding site is mostly defined by residues from the inner β-core and is therefore relatively rigid and pre-organized. The outer portion of the cavity is defined by residues from peripheral structures, including the strands β8 and β9, the 3₁₀-helix η2, the loop connecting β6 and β7, and the loop connecting β12 and β13 (Figure 5A). Since many of these residues are surface exposed, their sidechains (especially the longer sidechains of Lys63, Lys74, Lys75, Glu149, Glu151, and Lys199) are expected to be highly flexible. The B-factor surface rendering may also reflect the different flexibilities of atoms in different areas of the cavity (Figure 5B). These properties explain how the cavity can accommodate various phosphopeptides in the modeling studies. While the phosphorylated residue reaches into the inner (and deeper) portion of the cavity, the flanking residues are accommodated by the outer, shallower, and more flexible portion of the cavity.
These modeling results indicate that the active site pocket known for pyrophosphate binding and catalysis has the potential to accommodate various phosphopeptides with single- or double-site phosphorylation. It should be pointed out that the models only minimized steric clashes while keeping good geometry of the peptides and protein. No effort was made to optimize the intermolecular interactions (such as hydrogen bonds, hydrophobic contacts, and electrostatic interactions). When these interactions come into play in reality, the PPA1 active site pocket might have substrate specificity for certain phosphopeptides. Of note, the six charged residues Lys63, Lys74, Lys75, Glu149, Glu151, and Lys199 are placed in different areas of the pocket (in Figure 5B, Lys63 at bottom left, Lys74 and Lys75 at top left, Lys199 at top right, Glu149 and Glu151 on the left). This distribution of charged residues may have implications for peptide substrate specificity. Of course, the exact structural details of peptide substrate recognition by human PPA1 can only be revealed by high-resolution structures of PPA1 in complex with phosphopeptide or phosphoprotein substrates. It is known that human PPA1 can directly dephosphorylate pJNK1 but cannot dephosphorylate pERK or p-p38 [22]. Structures of PPA1-pJNK1 complexes are needed to reveal the molecular basis of this substrate specificity. Such structures would also be very helpful for the development of peptide or peptide-like inhibitors of human PPA1.