About ProAffiMuSeq

ProAffiMuSeq is a web server that predicts the change in binding free energy (ΔΔG, in kcal/mol) caused by mutations in protein-protein complexes. With 10-fold cross-validation, ProAffiMuSeq shows a correlation of 0.73 and a mean absolute error (MAE) of 0.86 kcal/mol. On the test dataset, performance remains consistent, with a correlation of 0.75 and an MAE of 0.94 kcal/mol. On a blind dataset of 473 mutations (Geng et al. 2019), it showed a correlation of 0.27 and an MAE of 1.06 kcal/mol, which is comparable to structure-based methods. Further, our method showed an MAE of 1.21 kcal/mol when tested on a set of 552 additional non-redundant interface mutations in 80 complexes deposited in SKEMPI 2.0 (Jankauskaitė et al. 2019). ProAffiMuSeq uses functional information about the complexes and sequence-based features to predict the ΔΔG value. ProAffiMuSeq is unique in that it does not require the structure of the complex, so it can be used for complexes that do not have a known structure.

What are protein-protein complexes?

Protein-protein complexes are made up of two or more proteins that associate with each other through non-covalent interactions (such as electrostatic interactions, van der Waals forces, hydrogen bonds and hydrophobic interactions). These complexes mediate crucial functions in the cell, including signalling, immunity, metabolism and transport. For example, the Ras protein acts as a molecular switch that can activate different pathways for cell growth, differentiation and apoptosis by binding to different effectors (such as Raf, RalGDS, PI3K and Nore1). Interleukin-4, bound to its receptor, can recruit eosinophils, is involved in IgE secretion and helps in the differentiation of T-helper cells. For detailed reviews of protein-protein interactions, see the following references.

Jones S and Thornton JM (1996). Principles of protein-protein interactions. Proc Natl Acad Sci USA. 93 (1): 13-20.
Stites WE (1997). Protein-Protein Interactions: Interface Structure, Binding Thermodynamics, and Mutational Analysis. Chem Rev. 97 (5): 1233-1250.
Ali MH and Imperiali B (2005). Protein oligomerization: How and why. Bioorgan Med Chem. 13 (17): 5013-5020.
Keskin O et al. (2008). Principles of Protein-Protein Interactions: What are the Preferred Ways For Proteins To Interact? Chem Rev. 108 (4): 1225-1244.
Janin J (2009). Basic Principles of Protein-Protein Interaction. In R Nussinov and G Schreiber (Eds), Computational Protein-Protein Interactions, 1-19. Boca Raton: CRC Press.

Functional classes

Each protein-protein interaction has a specific function. We have developed models for six functional classes, described below:

Functional class	Description
Antigen-antibody complex	Complex formed between an antibody and any antigen, important for immunity
Enzyme-inhibitor complex	Complex between an enzyme and an inhibitor protein; the inhibitor is often tightly bound to the enzyme and thus regulates enzyme function
G-protein complex	Complex formed between a G-protein (which binds GTP/GDP and possesses GTPase activity) and any effector; the complex is often involved in signal transduction
Receptor complex	Receptor bound to a ligand protein, such as growth hormone bound to its receptor
Other enzyme complex	Complex between an enzyme and any non-inhibitory protein, such as the enzyme substrate
Miscellaneous	Heteromeric complexes which do not fall in any of the above classes

We have also developed a model for homodimers (complexes of two identical proteins), which can be accessed from the Homodimer tab of the Predict page.

Binding free energy upon mutation

A protein usually collides with its binding partner multiple times in the course of diffusion. If the two proteins are close to the native conformation of the complex, they form an intermediate complex and proceed to bind. The equilibrium constant for dissociation (K_d) can be expressed in terms of the concentrations of the complex and the two free proteins and is given by:

K_d = [A][B] / [AB]

We can calculate the free energy of binding using the dissociation constant as follows:

ΔG_bind = RT ln K_d

where K_d is the equilibrium dissociation constant, R is the gas constant (0.001987 kcal/(K·mol)), T is the temperature in Kelvin and [AB], [A] and [B] refer to the concentrations of the complex and the free proteins.
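As a worked illustration of the relation between the dissociation constant and the binding free energy, the following minimal Python sketch converts a dissociation constant into a binding free energy. The function name `binding_free_energy`, the default temperature of 298.15 K and the 1 nM example value are assumptions made for illustration; they are not part of ProAffiMuSeq.

```python
import math

# Gas constant in kcal/(K·mol).
R_KCAL = 1.987e-3


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return the binding free energy (kcal/mol) for a dissociation constant in mol/L.

    Hypothetical helper: applies delta_G = R * T * ln(K_d).
    """
    if kd_molar <= 0:
        raise ValueError("K_d must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Illustrative value only: a complex with K_d = 1 nM at 25 degrees Celsius.
dg = binding_free_energy(1e-9)
print(f"Binding free energy: {dg:.2f} kcal/mol")
```

For a 1 nM complex at 298.15 K this gives roughly -12.3 kcal/mol. Tighter binding (a smaller dissociation constant) gives a more negative value.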

Mutations at the protein-protein interaction (PPI) interface can affect binding affinity and stability by disrupting bonds and causing conformational changes. This, in turn, can affect the ability of the proteins to perform their functions and may cause disease.

For a mutation, the change in binding free energy is given by:

ΔΔG_bind = ΔG_bind(mutant) − ΔG_bind(wild type)
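The change in binding free energy can be illustrated with a short Python sketch that starts from the dissociation constants of the wild-type and mutant complexes. It assumes the common sign convention of mutant minus wild type, so that a positive value means weaker binding. The function name `ddg_from_kd` and the example constants are hypothetical.

```python
import math

# Gas constant in kcal/(K·mol).
R_KCAL = 1.987e-3


def ddg_from_kd(kd_wild: float, kd_mutant: float, temperature_k: float = 298.15) -> float:
    """Return the change in binding free energy (kcal/mol), mutant minus wild type.

    Assumed convention: RT ln(Kd_mutant) - RT ln(Kd_wild) = RT ln(Kd_mutant / Kd_wild).
    """
    if kd_wild <= 0 or kd_mutant <= 0:
        raise ValueError("dissociation constants must be positive")
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild)


# Illustrative values only: the mutation weakens binding tenfold (1 nM to 10 nM).
ddg = ddg_from_kd(kd_wild=1e-9, kd_mutant=1e-8)
print(f"Change in binding free energy: {ddg:.2f} kcal/mol")
```

Under this convention, a tenfold loss of affinity at 298.15 K corresponds to about +1.36 kcal/mol, which gives a sense of scale for the errors reported for prediction methods.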

Binding free energy can be determined experimentally using isothermal titration calorimetry, surface plasmon resonance, spectroscopy or fluorescence-based methods. However, these methods are time-consuming and require specialized equipment as well as the expression and purification of proteins. This makes computational prediction of changes in binding free energy an attractive alternative.

Brief methodology of ProAffiMuSeq

We used training and test data from PROXiMATE (Jemimah, Yugandhar and Gromiha, 2017) and sequence-based features mainly derived from AAIndex, as well as conservation scores and position-specific scoring matrices (PSSM). Feature selection was done in two stages: an initial exhaustive search over all 4-feature combinations, followed by forward selection. Separate models were trained for each functional class and each side of the interface. For instance, for the enzyme-inhibitor class, we have one model for mutations in the enzyme and another for mutations in the inhibitor. Based on the user’s input, the appropriate model is selected to predict the ΔΔG value for the given mutations.
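The two-stage feature selection described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a plain least-squares linear model, unshuffled 5-fold cross-validated MAE as the selection criterion and synthetic data, none of which is confirmed by the description above as the actual ProAffiMuSeq setup. The names `cv_mae` and `select_features` are hypothetical.

```python
from itertools import combinations

import numpy as np


def cv_mae(X, y, cols, n_folds=5):
    """Cross-validated MAE of a least-squares linear model on the given columns."""
    idx = np.arange(len(y))
    errors = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)
        # Append a column of ones so the model has an intercept.
        a_train = np.column_stack([X[train][:, cols], np.ones(len(train))])
        a_test = np.column_stack([X[fold][:, cols], np.ones(len(fold))])
        coef, *_ = np.linalg.lstsq(a_train, y[train], rcond=None)
        errors.append(np.abs(a_test @ coef - y[fold]))
    return float(np.mean(np.concatenate(errors)))


def select_features(X, y, seed_size=4, max_features=6):
    """Exhaustive search over seed_size-feature sets, then greedy forward selection."""
    n_features = X.shape[1]
    # Stage 1: score every combination of seed_size features and keep the best.
    best_combo = min(
        combinations(range(n_features), seed_size),
        key=lambda combo: cv_mae(X, y, list(combo)),
    )
    selected = list(best_combo)
    best_score = cv_mae(X, y, selected)
    # Stage 2: add one feature at a time while the cross-validated MAE improves.
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        scores = {f: cv_mae(X, y, selected + [f]) for f in candidates}
        best_feature = min(scores, key=scores.get)
        if scores[best_feature] >= best_score:
            break
        selected.append(best_feature)
        best_score = scores[best_feature]
    return selected, best_score


# Synthetic data for illustration: only features 0, 2 and 5 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 7))
y = X[:, 0] - 2 * X[:, 2] + 0.5 * X[:, 5] + rng.normal(scale=0.1, size=60)
selected, score = select_features(X, y)
print("Selected feature indices:", sorted(selected))
```

The exhaustive stage grows combinatorially with the number of candidate features, which is one reason to restrict it to small feature sets and to extend the best set greedily afterwards.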

Performance

With 10-fold cross-validation, ProAffiMuSeq shows a correlation of 0.73 and a mean absolute error (MAE) of 0.86 kcal/mol. On the test dataset, performance remains consistent, with a correlation of 0.75 and an MAE of 0.94 kcal/mol. We also tested our method on an external validation dataset of 473 mutations, derived from the validation datasets published for iSEE (Geng et al., 2019). On this dataset, ProAffiMuSeq's performance is comparable to that of existing structure-based methods. The performance metrics are tabulated below.

Metric	iSEE	FoldX	mCSM	BindProfX	ProAffiMuSeq
Pearson correlation	0.26	0.35	0.25	0.39	0.27
Spearman correlation	0.22	0.41	0.28	0.41	0.27
MAE (kcal/mol)	0.99	1.07	0.96	0.91	1.06
RMSE (kcal/mol)	1.30	1.50	1.34	1.21	1.41
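For readers who want to reproduce this kind of comparison on their own data, the four metrics in the table can be computed as in the following Python sketch. The function names `average_ranks` and `evaluate` and the four example values are hypothetical; the example numbers are not taken from any of the datasets discussed here.

```python
import numpy as np


def average_ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    for value in np.unique(values):
        tie = values == value
        ranks[tie] = ranks[tie].mean()
    return ranks


def evaluate(experimental, predicted):
    """Compute Pearson and Spearman correlations, MAE and RMSE."""
    exp = np.asarray(experimental, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - exp
    return {
        "pearson": float(np.corrcoef(exp, pred)[0, 1]),
        # Spearman correlation is the Pearson correlation of the ranks.
        "spearman": float(np.corrcoef(average_ranks(exp), average_ranks(pred))[0, 1]),
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }


# Made-up values in kcal/mol, for illustration only.
metrics = evaluate([0.5, 1.2, -0.3, 2.0], [0.7, 0.9, 0.1, 1.5])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MAE and RMSE are in the same units as the predictions (kcal/mol), while the two correlation coefficients are unitless, so a method can rank mutations well and still show a sizeable absolute error, or the reverse.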

Further, our method showed an MAE of 1.21 kcal/mol when tested on a set of 552 additional non-redundant interface mutations in 80 complexes deposited in SKEMPI 2.0 (Jankauskaitė et al. 2019).

References

Geng C, Vangone A, Folkers GE, Xue LC, Bonvin AMJJ (2019). iSEE: Interface structure, evolution, and energy-based machine learning predictor of binding affinity changes upon mutations. Proteins. 87 (2): 110-119.

Gromiha MM, Yugandhar K and Jemimah S (2016). Protein-protein interactions: scoring schemes and binding affinity. Curr Opin Struct Biol. 44: 31-38.

Jankauskaitė J, Jiménez-García B, Dapkunas J, Fernández-Recio J, Moal IH (2019). SKEMPI 2.0: an updated benchmark of changes in protein-protein binding energy, kinetics and thermodynamics upon mutation. Bioinformatics. 35 (3): 462-469.

Jemimah S, Yugandhar K and Gromiha MM (2017). PROXiMATE: a database of mutant protein–protein complex thermodynamics and kinetics. Bioinformatics. 33 (17): 2787-2788.