A Random Forest model for predicting allosteric and functional sites on proteins

Ava S.-Y. Chen, Nicholas J. Westwood, Paul Brear, Graeme W. Rogers, Lazaros Mavridis, John B. O. Mitchell

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)


We created a computational method to identify allosteric sites, using a machine learning model trained and tested on protein structures containing bound ligand molecules. We adopted the Random Forest approach to build a three-way predictive model. Based on descriptors collated for each ligand and binding site, the classification model assigns protein cavities as allosteric, regular or orthosteric, and hence identifies allosteric sites. For each complex, 43 structural descriptors were derived and used to characterize individual protein-ligand binding sites belonging to the three classes: allosteric, regular and orthosteric. We carried out a separate validation on a further unseen set of protein structures containing the ligand 2-(N-cyclohexylamino)ethanesulfonic acid (CHES).
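The three-way classification described in the abstract can be illustrated with a minimal, standard-library-only sketch of the Random Forest idea: each tree is grown on a bootstrap resample, a random subset of descriptors is considered at every split, and the forest predicts by majority vote. This is not the authors' implementation. The abstract does not state the software, hyperparameters or descriptor definitions, so the tree count, depth limit, the square-root feature-subset rule and the synthetic Gaussian "descriptors" below are all assumptions made for illustration, and every function name is hypothetical.

```python
import random
from collections import Counter

CLASSES = ("allosteric", "regular", "orthosteric")
N_DESCRIPTORS = 43  # descriptor count stated in the abstract


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most common label (ties resolved by first occurrence)."""
    return Counter(labels).most_common(1)[0][0]


def best_split(rows, labels, features):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, rng, n_try, max_depth=6):
    """Grow one tree, drawing n_try random candidate descriptors per split."""
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)  # leaf node: a class label
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return majority(labels)
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    children = [
        grow_tree([rows[i] for i in idx], [labels[i] for i in idx],
                  rng, n_try, max_depth - 1)
        for idx in (left, right)
    ]
    return (f, t, children[0], children[1])  # internal node


def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))  # assumed sqrt rule: 6 of 43
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], rng, n_try))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


def synthetic_sites(n_per_class, rng):
    """Made-up data: class k draws every descriptor from N(3k, 1)."""
    rows, labels = [], []
    for k, name in enumerate(CLASSES):
        for _ in range(n_per_class):
            rows.append([rng.gauss(3.0 * k, 1.0) for _ in range(N_DESCRIPTORS)])
            labels.append(name)
    return rows, labels


data_rng = random.Random(1)
train_X, train_y = synthetic_sites(30, data_rng)
test_X, test_y = synthetic_sites(10, data_rng)

forest = fit_forest(train_X, train_y)
predicted = [predict_forest(forest, row) for row in test_X]
accuracy = sum(p == y for p, y in zip(predicted, test_y)) / len(test_y)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The held-out split here only mimics the idea of testing on unseen structures; the synthetic classes are deliberately well separated, so the printed accuracy says nothing about performance on real binding sites.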
Original language: English
Pages (from-to): 125-135
Number of pages: 11
Journal: Molecular Informatics
Issue number: 3-4
Early online date: 21 Jan 2016
Publication status: Published - 5 Apr 2016


Keywords:

  • Random Forest
  • Machine learning
  • Cheminformatics
  • Drug Design
  • Allosteric site

