4. Discussion
Transient association arises when a protein forms and later breaks interactions with a partner protein, binding to it temporarily to perform a specific function. The binding event is brought about mostly by non-covalent bonds at the interface between the interacting proteins, and these bonds also determine the nature and strength of the interaction. The consequences of such an event can range from no structural change (when the interaction is weak) to long-range, allosteric transmission of the perturbation signal. The focus of our work was to study how direct associations perturb the connectivity of residues in globular proteins. We studied the effect of the perturbation at, around and away from the interface by constructing a protein structural network (PSN) of the connectivity within the protein and analysing how the network is altered.
A collection of structures from the ProPairs database was used for the analysis. After filtering the structures based on their crystallographic properties, we identified 895 protein chains that associate transiently with other proteins and have the same oligomeric state in the bound complex and in the unbound form. The structure of each protein chain taken from a protein complex (the bound form) was compared with the structure of the same chain when it is not bound to the interacting partner but is in the same oligomeric state (the unbound form). At the local level, changes in network parameters such as the number of edges, hubs and centrality measures were studied. The global comparison was made in terms of topological change, that is, backbone deviation (RMSD), and in terms of structural network dissimilarity (NDS).
Graph spectral comparison methods were used to compute the dissimilarity between the networks; these involve spectral decomposition of the PSNs to obtain their eigenvectors and eigenvalues. A few case studies with high network variation were identified using the global and basic network comparisons. In these cases, a major contribution to the NDS arises from its EWCS component, which mostly captures the change in the local clustering of residues. The eigenvector corresponding to the second smallest eigenvalue is called the Fiedler vector (Fv). This vector provides meaningful information on the algebraic connectivity of the network and can be used to partition the network into clusters. The Fv of a pair of PSNs can therefore be compared to identify the sites with high variation in the clustering of nodes. This is illustrated with a case study.
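As a minimal sketch of how a Fiedler vector can be obtained, the Python example below builds the unnormalised graph Laplacian L = D - A from an adjacency matrix and takes the eigenvector of its second smallest eigenvalue. The choice of the unnormalised Laplacian, the function name `fiedler_vector` and the six-node toy graph are assumptions made for illustration; they are not taken from the PSN construction used in this work.

```python
import numpy as np

def fiedler_vector(adjacency):
    """Return (algebraic connectivity, Fiedler vector) of an undirected graph.

    Assumes a symmetric adjacency matrix and the unnormalised Laplacian L = D - A.
    """
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues of a symmetric matrix in ascending order
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    fv = eigvecs[:, 1]
    # Eigenvectors are defined up to sign: make the first non-zero entry positive
    first = int(np.argmax(np.abs(fv) > 1e-12))
    if fv[first] < 0:
        fv = -fv
    return float(eigvals[1]), fv

# Hypothetical toy network: two triangles (nodes 0-2 and 3-5) joined by edge 2-3
A_toy = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A_toy[i, j] = A_toy[j, i] = 1.0

lam2, fv = fiedler_vector(A_toy)
# The sign of each component assigns the node to one of two clusters
clusters = [int(x > 0) for x in fv]
print(lam2, clusters)
```

In this toy graph the sign pattern of the vector separates the two triangles, which is the partitioning property referred to above.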
The first step is to identify the Fv from the spectra of the pair of graphs. Figure 3A shows the aligned Fv of the bound and unbound forms of the DLD protein. The absolute difference between the aligned vectors is computed to find regions of the protein where the two vectors disagree. Figure 3B shows this difference and highlights the region where the vectors vary. The sites with the most variation, that is, those with the highest absolute difference, are shown as sticks in Figure 3C. The cartoon diagram of chain A of DLD is coloured by the absolute difference between the vectors, and chain B is shown as a grey surface. The interacting partner, the E3BP protein, is shown in yellow surface representation. Side chains of the five residues with the highest absolute difference, yellow in the bound form and red in the unbound form, are shown using spheres.
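The comparison of two Fiedler vectors can be sketched as follows. The example assumes that both vectors are indexed by the same residues in the same order, and it resolves the sign ambiguity of eigenvectors by flipping one vector when the two are anti-correlated; this alignment rule, the function name `top_varying_sites`, the residue labels and the numerical values are all hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def top_varying_sites(fv_bound, fv_unbound, residue_ids, k=5):
    """Rank residues by the absolute difference between two Fiedler vectors."""
    b = np.asarray(fv_bound, dtype=float)
    u = np.asarray(fv_unbound, dtype=float)
    if b.shape != u.shape or len(residue_ids) != b.size:
        raise ValueError("vectors and residue labels must have the same length")
    # Assumed alignment step: an eigenvector and its negative are equivalent,
    # so flip one vector if the pair is anti-correlated
    if np.dot(b, u) < 0:
        u = -u
    diff = np.abs(b - u)
    order = np.argsort(diff)[::-1][:k]  # indices of the k largest differences
    return [(residue_ids[i], float(diff[i])) for i in order]

# Made-up values for five hypothetical residues
labels = ["RES1", "RES2", "RES3", "RES4", "RES5"]
fv_b = [0.50, 0.40, 0.10, -0.30, -0.70]
fv_u = [-0.50, -0.38, -0.45, 0.30, 0.65]

top_sites = top_varying_sites(fv_b, fv_u, labels, k=2)
print(top_sites)
```

Residues returned at the top of this ranking correspond to the sites that would be highlighted in a plot of the absolute difference.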
An alteration of network parameters close to the interface is expected, as the interfacial sites make new interactions with the binding partner. Often, however, alterations are also observed far from the binding site because of allostery, the transmission of the perturbation to distant sites. The path of this allosteric signal can be analysed by tracing the shortest path between the site of perturbation and the site of significant network alteration. The change in the shortest path between the binding site and the site of perturbation is analysed in the case study and discussed in Supplementary Figure 4. A new edge between the spatially proximal nodes GLU 437 and ASP 350 in the bound form shortens this path compared with the several possible short paths in the unbound form.
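The effect of a single new edge on the shortest path can be illustrated with a small sketch. It uses breadth-first search on an unweighted graph, which is an assumption: a PSN may carry edge weights, in which case a weighted algorithm such as Dijkstra's would be the natural substitute. The node labels and edge lists are hypothetical and do not correspond to residues of the case study.

```python
from collections import deque

def shortest_path(edges, source, target):
    """Breadth-first shortest path in an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if source not in adj or target not in adj:
        return None
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parents to rebuild the path
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in sorted(adj[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None  # target not reachable from source

# Hypothetical unbound network: two alternative routes of equal length from S to T
unbound_edges = [("S", "a"), ("a", "b"), ("b", "c"), ("c", "T"),
                 ("S", "d"), ("d", "e"), ("e", "f"), ("f", "T")]
# Hypothetical bound network: one additional edge between nearby nodes a and c
bound_edges = unbound_edges + [("a", "c")]

path_unbound = shortest_path(unbound_edges, "S", "T")
path_bound = shortest_path(bound_edges, "S", "T")
print(len(path_unbound) - 1, len(path_bound) - 1)
```

Here the added edge removes one step from the route, mirroring in miniature how a newly formed contact can shorten the communication path.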
Most of the variability observed in the dataset occurs at non-interfacial sites. This is also evident from the variation of degree and strength at, around and away from the interface shown in Supplementary Figure 2. When the network variation is computed from the structure network of the non-interface sites alone, the network dissimilarity exceeds 50% of the NDS obtained from all residues in almost 89% of the cases. This result also suggests that most of the network away from the binding site is affected by the perturbation, but that not all sites are perturbed to the same extent. As seen from the absolute difference between the Fv in the DLD case study, only five nodes along the entire length of this long protein were affected strongly enough to cluster differently. The effect on all other sites is feeble and results from subtle changes in the local conformation of side chains. This shows that most residues remain in the same topology and that interactions that are broken are compensated by other interactions being made. Hence, what is observed is predominantly a rearrangement of interactions. However, in about 60% of the 285 cases in the working dataset that are identified as enzymes, a net loss in connectivity is observed. Hence, when the interacting protein is an enzyme, the structure of the bound form may be less compact than that of the unbound form, which may serve its function of catalysing several different reactions.
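The distinction between a rearrangement of interactions and a net loss in connectivity can be made concrete by counting the edges gained and lost between the two forms. The sketch below treats each network as a set of undirected edges; the function name `edge_turnover` and the toy edge lists are hypothetical, and edge weights are ignored for simplicity.

```python
def edge_turnover(unbound_edges, bound_edges):
    """Count edges gained and lost on going from the unbound to the bound form."""
    # frozenset makes (u, v) and (v, u) the same undirected edge
    unbound = {frozenset(edge) for edge in unbound_edges}
    bound = {frozenset(edge) for edge in bound_edges}
    gained = bound - unbound
    lost = unbound - bound
    return {"gained": len(gained), "lost": len(lost),
            "net": len(gained) - len(lost)}

# Hypothetical example: two contacts are broken and two new ones are formed
unbound_toy = [(1, 2), (2, 3), (3, 4), (4, 5)]
bound_toy = [(2, 1), (2, 3), (3, 5), (1, 4)]

turnover = edge_turnover(unbound_toy, bound_toy)
print(turnover)
```

A net value near zero with non-zero gained and lost counts indicates rearrangement, whereas a negative net value indicates an overall loss in connectivity.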
The case studies were chosen so that their analysis has clinical relevance and contributes to understanding their mechanisms. The alteration of the network in all the case studies can be related to several human disease conditions, such as antibacterial resistance, diabetes, toxin-induced cell death and lactic acidosis. We also related these alterations to known information about the mechanisms by which the proteins function. Hence, the analysis of the structure network is a necessary and beneficial tool in the analysis of structural excursions. The development of tools that can analyse the impact of protein-protein interactions will help in understanding allosteric mechanisms and in applying network analysis of protein structures to stability engineering and docking studies.