A note on utilising binary features as ligand descriptors

Hamse Yussuf Mussa, John B. O. Mitchell, Robert Glen

Research output: Contribution to journal › Article › peer-review


It is common in cheminformatics to represent the properties of a ligand as a string of 1’s and 0’s, with the intention of elucidating, inter alia, the relationship between the chemical structure of a ligand and its bioactivity. In this commentary we note that, where relevant but non-redundant features are binary, they inevitably lead to a classifier capable of capturing only a linear relationship between structural features and activity. If, instead, we were to use relevant but non-redundant real-valued features, the resulting predictive model would be capable of describing a non-linear structure-activity relationship. Hence, we suggest that real-valued features, where available, are to be preferred in this scenario.
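One way to make the linearity claim concrete is a small numerical check. The sketch below is not taken from the article. It assumes a naive-Bayes-style classifier in which each binary feature is conditionally independent given the class (one possible reading of "non-redundant") and follows a Bernoulli distribution. The probabilities in `P_ACTIVE`, `P_INACTIVE` and `PRIOR_ACTIVE`, and all function names, are hypothetical values chosen for illustration. Under those assumptions the log posterior odds reduce to a bias plus a weighted sum of the bits, so the decision boundary is linear in the features.

```python
import itertools
import math

# Hypothetical values of P(x_j = 1 | class) for three binary features.
# These numbers are illustrative assumptions, not data from the article.
P_ACTIVE = [0.9, 0.2, 0.6]
P_INACTIVE = [0.3, 0.5, 0.4]
PRIOR_ACTIVE = 0.4  # assumed prior probability of the "active" class


def log_odds_direct(x):
    """Log posterior odds computed directly from the Bernoulli likelihoods."""
    log_active = math.log(PRIOR_ACTIVE)
    log_inactive = math.log(1.0 - PRIOR_ACTIVE)
    for bit, p, q in zip(x, P_ACTIVE, P_INACTIVE):
        log_active += math.log(p if bit else 1.0 - p)
        log_inactive += math.log(q if bit else 1.0 - q)
    return log_active - log_inactive


def linear_form():
    """Return (weights, bias) such that log-odds = bias + sum_j w_j * x_j."""
    weights = [
        math.log(p * (1.0 - q) / (q * (1.0 - p)))
        for p, q in zip(P_ACTIVE, P_INACTIVE)
    ]
    bias = math.log(PRIOR_ACTIVE / (1.0 - PRIOR_ACTIVE)) + sum(
        math.log((1.0 - p) / (1.0 - q)) for p, q in zip(P_ACTIVE, P_INACTIVE)
    )
    return weights, bias


def max_gap():
    """Largest difference between the two forms over all 2^3 bit strings."""
    weights, bias = linear_form()
    gaps = []
    for x in itertools.product((0, 1), repeat=len(P_ACTIVE)):
        linear = bias + sum(w * bit for w, bit in zip(weights, x))
        gaps.append(abs(log_odds_direct(x) - linear))
    return max(gaps)


if __name__ == "__main__":
    print(f"max |direct - linear| = {max_gap():.2e}")
```

The two forms agree up to floating-point rounding for every possible bit string. By contrast, if the features were real-valued and modelled, for example, as Gaussians with class-specific variances, the same log-odds would contain squared terms in the features, which is one way a non-linear structure-activity relationship can arise.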
Original language: English
Article number: 58
Journal: Journal of Cheminformatics
Publication status: Published - 1 Dec 2015


  • Binary descriptors
  • Ligand chemical structure
  • Linear relationship
  • Bernoulli distribution

