The mitochondrial F1FO-ATPase in the presence of the natural cofactor Mg2+ acts as the enzyme of life by synthesizing ATP, but it can also hydrolyze ATP to pump H+. Interestingly, Mg2+ can be replaced by Ca2+, but only to sustain ATP hydrolysis and not ATP synthesis. When Ca2+ inserts in F1, the torque generated by the chemomechanical coupling between F1 and the rotating central stalk was reported to be unable to drive the transmembrane H+ flux within FO. However, the failed H+ translocation is not consistent with the oligomycin sensitivity of the Ca2+-dependent F1FO-ATP(hydrol)ase. Recent advances suggest new enzyme roles in mitochondrial energy transduction. Accordingly, the structural F1FO-ATPase distortion driven by ATP hydrolysis sustained by Ca2+ is consistent with the permeability transition pore signal propagation pathway. The Ca2+-activated F1FO-ATPase, by forming the pore, may contribute to dissipating the transmembrane H+ gradient created by the same enzyme complex.
Normal Mode Analysis is a fast and inexpensive approach that is largely used to gain insight into functional protein motions, and more recently to create conformations for further computational studies. However, when the protein structure is unknown, the use of computational models is necessary. Here, we analyze the capacity of normal mode analysis in internal coordinate space to predict protein motion, its intrinsic flexibility and atomic displacements, using protein models instead of native structures, and the possibility to use it for model refinement. Our results show that normal mode analysis is quite insensitive to modelling errors, but that calculations are strictly reliable only for very accurate models. Our study also suggests that internal normal mode analysis is a more suitable tool for the improvement of structural models, and for integrating them with experimental data or in other computational techniques, such as protein docking or more refined molecular dynamics simulations.
The polyene polyketides amphotericin B (AMB) and nystatin (NYS) are important antifungal drugs. Thioesterases (TEs), located at the last module of the polyketide synthase (PKS), control the release of polyketides by cyclization or hydrolysis. Intrigued by the tiny structural difference between AMB and NYS, as well as the high sequence identity between AMB TE and NYS TE, we constructed four systems to study the structural characteristics, catalytic mechanism, and product release of AMB TE and NYS TE with combined MD simulations and QM/MM calculations. The results indicated that, compared with AMB TE, NYS TE shows higher specificity for its natural substrate, and R26 as well as D186 were proposed to play a key role in substrate recognition. The energy barriers of macrocyclization in the AMB-TE-Amb and AMB-TE-Nys systems were calculated to be 14.0 and 22.7 kcal/mol, while in the NYS-TE-Nys and NYS-TE-Amb systems the energy barriers were 17.5 and 25.7 kcal/mol, suggesting that cyclization with the natural substrates is more favorable than with the exchanged substrates. Finally, the binding free energies obtained with the MM-PBSA.py program suggested that it was easier for natural products to leave the TE enzymes after cyclization, and key residues for the departure of the polyketide product from the active site were highlighted. We provide a catalytic overview of AMB TE and NYS TE, including substrate recognition, catalytic mechanism, and product release. These results will improve the understanding of polyene polyketide TEs and help broaden the substrate flexibility of polyketide TEs.
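To give a feel for what the barrier differences reported for AMB TE and NYS TE could mean kinetically, the following minimal sketch converts a pair of barriers into a relative rate using the Eyring (transition-state theory) exponential. It rests on assumptions that are not stated in the abstract: that the computed barriers can be treated as activation free energies, that the pre-exponential factors cancel, and that the temperature is 298.15 K. The function name `relative_rate` is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872041


def relative_rate(barrier_a: float, barrier_b: float,
                  temperature: float = 298.15) -> float:
    """Return k_a / k_b for two barriers given in kcal/mol.

    Assumes k is proportional to exp(-barrier / (R*T)) with identical
    prefactors, so only the barrier difference matters.
    The default temperature is an assumption, not a value from the source.
    """
    return math.exp(-(barrier_a - barrier_b) / (R_KCAL * temperature))


if __name__ == "__main__":
    # Natural versus exchanged substrate for each enzyme (barriers from the text).
    print("AMB TE:", relative_rate(14.0, 22.7))
    print("NYS TE:", relative_rate(17.5, 25.7))
```

Under these assumptions, a difference of roughly 8 to 9 kcal/mol corresponds to a rate ratio of several orders of magnitude, which is in line with the qualitative conclusion that cyclization of the natural substrate is favored.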
The Protein Data Bank (PDB) file format remains a popular format, used and supported by many software packages to represent the coordinates of macromolecular structures. It suffers, however, from drawbacks such as error-prone manual editing. For this reason, various software toolkits have been developed to facilitate its editing and manipulation, but, to date, there is no online tool available for this purpose. Here we present PDB-Tools Web, a flexible online service for manipulating PDB files. It offers a rich and user-friendly graphical user interface that allows users to mix and match more than 40 individual tools from the pdb-tools suite. These can be combined in a few clicks to build complex pipelines, which can be saved and uploaded. The resulting processed PDB files can be visualized online and downloaded. The web server is freely available at https://wenmr.science.uu.nl/pdbtools.
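The idea of chaining small, single-purpose PDB filters into a pipeline can be sketched in a few lines of plain Python. This is not the pdb-tools code or its API; the names `select_chain`, `drop_hetatm`, `run_pipeline`, and `make_line` are hypothetical, and the only format fact relied on is that the chain identifier of ATOM/HETATM records sits in column 22 of the fixed-width PDB format.

```python
from typing import Callable, Iterable, Iterator, List

# A pipeline step takes an iterable of PDB lines and yields PDB lines.
PdbStep = Callable[[Iterable[str]], Iterator[str]]


def select_chain(chain_id: str) -> PdbStep:
    """Build a step that keeps coordinate records of one chain only."""
    def step(lines: Iterable[str]) -> Iterator[str]:
        for line in lines:
            if line.startswith(("ATOM", "HETATM")):
                # Column 22 (index 21) holds the chain identifier.
                if len(line) > 21 and line[21] == chain_id:
                    yield line
            else:
                yield line  # non-coordinate records pass through unchanged
    return step


def drop_hetatm(lines: Iterable[str]) -> Iterator[str]:
    """Step that removes all HETATM records."""
    for line in lines:
        if not line.startswith("HETATM"):
            yield line


def run_pipeline(lines: Iterable[str], steps: List[PdbStep]) -> List[str]:
    """Apply the steps in order, each consuming the previous step's output."""
    for step in steps:
        lines = step(lines)
    return list(lines)


def make_line(record: str, serial: int, name: str, resname: str,
              chain: str, resseq: int) -> str:
    """Hypothetical helper: build a truncated, column-aligned record for demos."""
    return f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"


if __name__ == "__main__":
    demo = [
        make_line("ATOM", 1, "N", "ALA", "A", 1),
        make_line("ATOM", 2, "N", "GLY", "B", 1),
        make_line("HETATM", 3, "O", "HOH", "A", 2),
        "END",
    ]
    for out_line in run_pipeline(demo, [select_chain("A"), drop_hetatm]):
        print(out_line)
```

Writing each step as a generator keeps the steps independent of one another, which is what makes free recombination of tools into longer pipelines straightforward.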
To greatly expand the druggable genome, fast and accurate predictions of cryptic sites for small-molecule binding in target proteins are in high demand. In this study, we have developed a fast and simple conformational sampling scheme guided by normal modes solved from coarse-grained elastic models, followed by atomistic backbone refinement and sidechain repacking. Despite the observations of complex and diverse conformational changes associated with ligand binding, we found that simply sampling along each of the lowest 30 modes is near optimal for adequately restructuring cryptic sites so they can be detected by existing pocket-finding programs like fpocket and ConCavity. We further trained machine-learning protocols to optimize the combination of the sampling-enhanced pocket scores with other dynamic and conservation scores, which only slightly improved the performance. As assessed on a training set of 84 known cryptic sites and a test set of 14 proteins, our method achieved high prediction accuracy (with area under the receiver operating characteristic curve > 0.8), comparable to the CryptoSite server. Compared with CryptoSite and other methods based on extensive molecular dynamics simulation, our method is much faster (1-2 hours for an average-size protein) and simpler (using only pocket scores), so it is suitable for high-throughput processing of large datasets of protein structures at the genome scale.
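The step of sampling along each of the lowest modes can be illustrated with a minimal sketch. It assumes the normal modes have already been computed elsewhere (for example from an elastic network model) and are supplied as flat displacement vectors; the amplitudes, the choice to displace in both directions, and the function names are illustrative assumptions rather than the authors' protocol, and the subsequent refinement and repacking steps are not shown.

```python
from typing import List, Sequence, Tuple


def displace_along_mode(coords: Sequence[float], mode: Sequence[float],
                        amplitude: float) -> List[float]:
    """Return coords + amplitude * mode for flat [x1, y1, z1, x2, ...] vectors."""
    if len(coords) != len(mode):
        raise ValueError("coordinates and mode must have the same length")
    return [c + amplitude * m for c, m in zip(coords, mode)]


def sample_along_modes(coords: Sequence[float],
                       modes: Sequence[Sequence[float]],
                       amplitudes: Sequence[float],
                       n_modes: int = 30) -> List[Tuple[int, float, List[float]]]:
    """Generate one conformation per (mode, signed amplitude) combination.

    Only the first n_modes modes are used; modes are assumed to be sorted
    from lowest to highest frequency.
    """
    conformations = []
    for index, mode in enumerate(modes[:n_modes]):
        for amplitude in amplitudes:
            for sign in (1.0, -1.0):  # assumption: sample both directions
                signed = sign * amplitude
                conformations.append(
                    (index, signed, displace_along_mode(coords, mode, signed))
                )
    return conformations


if __name__ == "__main__":
    # Two pseudo-atoms and two toy modes (not real normal modes).
    start = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
    toy_modes = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]]
    for mode_index, amp, conf in sample_along_modes(start, toy_modes, [0.5, 1.0]):
        print(mode_index, amp, conf)
```

Each generated conformation would then be passed to a pocket-detection program, with the best pocket score over all sampled conformations used for prediction.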
The FastDesign protocol in the molecular modeling program Rosetta iterates between sequence optimization and structure refinement to stabilize de novo designed protein structures and complexes. FastDesign has been used previously to design novel protein folds and assemblies with important applications in research and medicine. To promote sampling of alternative conformations and sequences, FastDesign includes stages where the energy landscape is smoothened by reducing repulsive forces. Here, we discover that this process disfavors larger amino acids in the protein core because the protein compresses in the early stages of refinement. By testing alternative ramping strategies for the repulsive weight, we arrive at a scheme that produces lower energy designs with more native-like sequence composition in the protein core. We further validate the protocol by designing and experimentally characterizing over 4000 proteins and show that the new protocol produces higher stability proteins.
Predicting the range of substrates accepted by an enzyme from its amino acid sequence is challenging. Although sequence- and structure-based annotation approaches are often accurate for predicting broad categories of substrate specificity, they generally cannot predict which specific molecules will be accepted as substrates for a given enzyme, particularly within a class of closely related molecules. Combining targeted experimental activity data with structural modeling, ligand docking, and physicochemical properties of proteins and ligands with various machine learning models provides complementary information that can lead to accurate predictions of substrate scope for related enzymes. Here we describe such an approach that can predict the substrate scope of bacterial nitrilases, which catalyze the hydrolysis of nitrile compounds to the corresponding carboxylic acids and ammonia. Each of the four machine learning models (linear regression, random forest, gradient-boosted decision trees, and support vector machines) performed similarly (average ROC = 0.9, average accuracy = ~82%) for predicting substrate scope for this dataset. The approach is intended to be highly modular with respect to physicochemical property calculations and software used for docking and modeling.
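The two performance figures quoted for the nitrilase models can be made concrete with a small sketch. It assumes that "ROC" refers to the area under the ROC curve, computed here through its rank interpretation (the probability that a randomly chosen active substrate is scored above a randomly chosen inactive one), and that accuracy is measured at a 0.5 score threshold; both are assumptions, and the function names are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at the given score threshold."""
    if not labels:
        raise ValueError("empty input")
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


if __name__ == "__main__":
    # Toy data: 1 = accepted substrate, 0 = not accepted.
    y_true = [1, 1, 0, 0]
    y_score = [0.9, 0.4, 0.6, 0.1]
    print(roc_auc(y_true, y_score), accuracy(y_true, y_score))
```

The pairwise form is quadratic in the number of examples, which is acceptable for a dataset of modest size; a sort-based rank computation would be preferable for large ones.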
Natural products and natural product-derived compounds have been widely used as pharmaceuticals for many years, and the search for new natural products that may have interesting activity is ongoing. Abyssomicins are natural product molecules that have antibiotic activity via inhibition of the folate synthesis pathway in microbiota. These compounds also appear to undergo a required [4+2] cycloaddition in their biosynthetic pathway. Here we report the structure of an FAD-dependent reductase, AbsH3, from the biosynthetic gene cluster of novel abyssomicins found in Streptomyces sp. LC-6-2.
The focal adhesion kinase (FAK) and the proline-rich tyrosine kinase 2-beta (PYK2) are implicated in cancer progression and metastasis and represent promising biomarkers and targets for cancer therapy. FAK and PYK2 are recruited to Focal Adhesions (FAs) via interactions between their Focal Adhesion Targeting (FAT) domains and conserved segments (LD motifs) on the proteins Paxillin, Leupaxin and Hic-5. A promising new approach for the inhibition of FAK and PYK2 targets interactions of the FAK domains with proteins that promote localization at Focal Adhesions. Advances toward this goal include the development of surface plasmon resonance, HSQC-NMR and fluorescence polarization assays for the identification of fragments or compounds interfering with the FAK-Paxillin interaction. We have recently validated this strategy, showing that Paxillin-mimicking polypeptides with 2-3 LD motifs displace FAK from FAs and block kinase-dependent and kinase-independent functions of FAK, including downstream integrin signalling and FA localization of the protein p130Cas. In the present work, we study by all-atom molecular dynamics simulations the recognition of peptides with the Paxillin and Leupaxin LD motifs by the FAK-FAT and PYK2-FAT domains. Our simulations and free-energy analysis interpret experimental data on binding of Paxillin and Leupaxin LD motifs at FAK-FAT and PYK2-FAT binding sites, and assess the roles of consensus LD regions and flanking residues. Our results can assist in the design of effective inhibitory peptides of the FAK-FAT:Paxillin and PYK2-FAT:Leupaxin complexes and the construction of pharmacophore models for the discovery of potential small-molecule inhibitors of the focal-adhesion-based functions of FAK-FAT and PYK2-FAT.
Isoflavonoids are one of the groups of flavonoids that play pivotal roles in the survival of land plants. Chalcone synthase (CHS), the first enzyme of the isoflavonoid biosynthetic pathway, catalyzes the formation of a common isoflavonoid precursor. We have previously reported that an isozyme of soybean CHS (termed GmCHS1) is a key component of the isoflavonoid metabolon, a protein complex that enhances the efficiency of isoflavonoid production. Here, we determined the crystal structure of GmCHS1 as a first step toward understanding the metabolon structure, as well as to better understand the catalytic mechanism of GmCHS1.
This paper reports on the results of research aimed at translating biometric 3D face recognition concepts and algorithms into the field of protein biophysics in order to precisely and rapidly classify morphological features of protein surfaces. Both human faces and protein surfaces are free-forms, and some descriptors used in differential geometry can be used to describe them by applying the principles of feature extraction developed for computer vision and pattern recognition. The first part of this study focused on building the protein dataset using a simulation tool and performing feature extraction using novel geometrical descriptors. The second part tested the method on two examples: the first involved a classification of tubulin isotypes, and the second compared tubulin with the FtsZ protein, which is its bacterial analogue. An additional test involved several unrelated proteins. Different classification methodologies were used: a classic approach with a Support Vector Machine (SVM) classifier and unsupervised learning with a k-means approach. The best result was obtained with SVM and the radial basis function (RBF) kernel. The results are significant and competitive with state-of-the-art protein classification methods. This opens a new area for protein structure analysis.
Structural characterization of alternatively folded and partially disordered protein conformations remains challenging. Outer surface protein A (OspA) is a pivotal protein in Borrelia infection, which is the etiological agent of Lyme disease. OspA exists in equilibrium with intermediate conformations, in which the central and the C-terminal regions of the protein have lower stabilities than the N-terminal. Here, we characterize pressure- and temperature-stabilized intermediates of OspA by nuclear magnetic resonance spectroscopy combined with paramagnetic relaxation enhancement (PRE). We found that the C-terminal region of the intermediate was partially disordered; however, it retains weak specific contact with the N-terminal region, owing to a twist of the central β-sheet and increased flexibility in the polypeptide chain. The disordered C-terminal region of the pressure-stabilized intermediate was more compact than that of the temperature-stabilized form. Further, molecular dynamics simulation demonstrated that temperature-induced disordering of the β-sheet was initiated at the C-terminal region and continued through to the central region. An ensemble of simulation snapshots qualitatively described the PRE data from the intermediate and indicated that the intermediate structures of OspA may expose tick receptor-binding sites more readily than does the basic folded conformation.
Expansins have the remarkable ability to loosen plant cell walls and cellulose material without showing catalytic activity and therefore have potential applications in biomass degradation. To support the study of sequence-structure-function relationships and the search for novel expansins, the Expansin Engineering Database (ExED, https://exed.biocatnet.de) collected sequence and structure data on expansins from Bacteria, Fungi, and Viridiplantae, and expansin-like homologues such as carbohydrate binding modules, glycoside hydrolases, loosenins, swollenins, cerato-platanins, and EXPNs. Based on global sequence alignment and protein sequence network analysis, the sequences are highly diverse. However, many similarities were found between the expansin domains. Newly created profile hidden Markov models of the two expansin domains enable standard numbering schemes, comprehensive conservation analyses, and genome annotation. Conserved key amino acids in the expansin domains were identified, a refined classification of expansins and carbohydrate binding modules was proposed, and new sequence motifs facilitate the search of novel candidate genes and the engineering of expansins.
The M42 aminopeptidases are a family of dinuclear aminopeptidases widely distributed in Prokaryotes. They are potentially associated with the proteasome, achieving complete peptide destruction. Their most peculiar characteristic is their quaternary structure, a tetrahedron-shaped particle made of twelve subunits. The catalytic site of M42 aminopeptidases is defined by seven conserved residues. Five of them are involved in metal ion binding, which is important to maintain both the activity and the oligomeric state. The sixth conserved residue, a glutamate, is the catalytic base deprotonating the water molecule during peptide bond hydrolysis. The seventh residue is an aspartate whose function remains poorly understood. This aspartate residue, however, must have a critical role, as it is strictly conserved in all MH clan enzymes. It forms a kind of catalytic triad with the histidine residue and the metal ion of the M2 binding site. We assess its role in TmPep1050, an M42 aminopeptidase of Thermotoga maritima, through a mutational approach. Asp-62 was substituted with an alanine, asparagine, or glutamate residue. The three Asp-62 substitutions completely abolished TmPep1050 activity and impeded dodecamer formation. They also interfered with metal ion binding, as only one cobalt ion is bound per subunit instead of two. The structural data showed that the Asp62Ala substitution affects the active site fold, which becomes similar to that of the TmPep1050 dimer. We propose a structural role for Asp-62, helping to stabilize a crucial loop in the active site and to correctly position the catalytic base and a metal ion ligand of the M1 site.
Accurate prediction of protein secondary structure (alpha-helix, beta-strand and coil) is a crucial step for protein inter-residue contact prediction and ab initio tertiary structure prediction. In a previous study, we developed a deep belief network-based protein secondary structure prediction method (DNSS1) and successfully advanced the prediction accuracy beyond 80%. In this work, we developed multiple advanced deep learning architectures (DNSS2) to further improve secondary structure prediction. The major improvements over the DNSS1 method include (i) designing and integrating six advanced one-dimensional deep convolutional/recurrent/residual/memory/fractal/inception networks to predict secondary structure, and (ii) using more sensitive profile features inferred from hidden Markov models (HMM) and multiple sequence alignments (MSA). Most of the deep learning architectures are novel for protein secondary structure prediction. DNSS2 was systematically benchmarked on two independent test datasets against eight state-of-the-art tools and consistently ranked as one of the best methods. In particular, DNSS2 was tested on the 82 protein targets of the 2018 CASP13 experiment and achieved the best Q3 score of 83.74% and SOV score of 72.46%. DNSS2 is freely available at: https://github.com/multicom-toolbox/DNSS2.
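For readers unfamiliar with the Q3 score quoted for DNSS2, it is the percentage of residues whose three-state label (helix, strand, or coil) is predicted correctly. A minimal sketch follows; the function name `q3_score` and the one-letter alphabet H/E/C are conventions assumed here, and the segment-based SOV score is not implemented.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of positions where the predicted three-state label is correct.

    Both strings use one character per residue: H (helix), E (strand), C (coil).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty sequence")
    allowed = set("HEC")
    if not (set(predicted) <= allowed and set(observed) <= allowed):
        raise ValueError("labels must be one of H, E, C")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Toy example: five of six residues are predicted correctly.
    print(q3_score("HHHECC", "HHHEEC"))
```

Because Q3 is a per-residue measure, it does not penalize fragmented segments, which is one reason a segment-overlap score such as SOV is usually reported alongside it.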
Protein-protein interactions (PPIs) are ubiquitous and functionally of great importance in biological systems. Hence, the accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into two stages: the sampling of docking poses to produce at least one near-native structure, followed by scoring to evaluate the large number of candidate structures. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model on pairwise potentials to refine the task of detecting the native protein quaternary structure among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the best performance for native detection among decoys compared to conventional scoring functions, while exhibiting weaker performance for the detection of low root mean square deviation (RMSD) decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. All data and scripts used are available at: https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR .
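One way a comparison concept can balance a set with a single native and many decoys is sketched below. The construction assumed here is that each native/decoy pair contributes two descriptors, the feature difference (native minus decoy, class 1) and its negation (decoy minus native, class 0), so both classes are equally populated. This is an illustrative reading of the abstract, not the published implementation; the feature values are toy numbers rather than KECSA2 potentials, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def comparison_descriptors(native: Sequence[float],
                           decoys: Sequence[Sequence[float]]
                           ) -> Tuple[List[List[float]], List[int]]:
    """Build a balanced training set from one native and many decoys."""
    features, labels = [], []
    for decoy in decoys:
        diff = [n - d for n, d in zip(native, decoy)]
        features.append(diff)                # native compared with decoy
        labels.append(1)
        features.append([-v for v in diff])  # decoy compared with native
        labels.append(0)
    return features, labels


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability that the first structure in the comparison is the native."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(features: Sequence[Sequence[float]],
                              labels: Sequence[int],
                              lr: float = 0.1,
                              epochs: int = 500) -> Tuple[List[float], float]:
    """Plain stochastic gradient descent on the logistic loss."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = predict_proba(w, b, x) - y
            w = [wj - lr * err * xj for wj, xj in zip(w, x)]
            b -= lr * err
    return w, b


if __name__ == "__main__":
    native = [-5.0, -3.0]  # toy two-feature score vector
    decoys = [[-2.0, -1.0], [-1.0, -2.5], [-3.0, 0.0]]
    X, y = comparison_descriptors(native, decoys)
    w, b = train_logistic_regression(X, y)
    print(predict_proba(w, b, X[0]), predict_proba(w, b, X[1]))
```

At prediction time, a classifier trained this way compares two candidate structures at a time, so ranking a full decoy set requires aggregating the pairwise outcomes.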
Allostery governing two conformational states is one of the proposed mechanisms for catch-bond behavior in adhesion proteins. In FimH, a catch-bond protein expressed by pathogenic bacteria, separation of two domains disrupts inhibition by the pili domain. Thus, tensile force can induce a conformational change in the lectin domain, from an inactive state to an active state with high affinity. To better understand allosteric inhibition in two-domain FimH (H2 inactive), we use molecular dynamics simulations to study the lectin domain alone, which has high affinity (HL active), and also the lectin domain stabilized in the low-affinity conformation by an Arg-60-Pro mutation (HL mutant). Because ligand-binding induces an allostery-like conformational change in HL mutant, this more experimentally tractable version has been proposed as a “minimal model” for FimH. We find that HL mutant has larger backbone fluctuations than both H2 inactive and HL active, at the binding pocket and allosteric interdomain region. We use an internal coordinate system of dihedral angles to identify protein regions with differences in backbone and sidechain dynamics beyond the putative allosteric pathway sites. By characterizing HL mutant dynamics for the first time, we provide additional insight into the transmission of allosteric information across the lectin domain and build upon structural and thermodynamic data in the literature to further support the use of HL mutant as a “minimal model.” Understanding how to alter protein dynamics to prevent the allosteric conformational change may guide drug development to prevent infection by blocking FimH adhesion.