Full "laplacianised" posterior naive Bayesian algorithm

Hamse Y. Mussa*, John B.O. Mitchell, Robert C. Glen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

15 Citations (Scopus)


Background: In the last decade the standard Naive Bayes (SNB) algorithm has been widely employed in multi-class classification problems in cheminformatics. This popularity is mainly because the algorithm is simple to implement and in many cases yields respectable classification results. Using clever heuristic arguments "anchored" by insightful cheminformatics knowledge, Xia et al. simplified the SNB algorithm further and termed the result the Laplacian Corrected Modified Naive Bayes (LCMNB) approach, which has been widely used in cheminformatics since its publication. In this note we illustrate mathematically the conditions under which Xia et al.'s simplification holds. We hope that this clarification will help Naive Bayes practitioners decide when it is appropriate to employ the LCMNB algorithm to classify large chemical datasets.

Results: A general formulation that subsumes the simplified Naive Bayes version is presented. Unlike the widely used NB method, the standard Naive Bayes description presented in this work is discriminative (not generative) in nature, which may lead to further applications of the SNB method.

Conclusions: Starting from a standard Naive Bayes (SNB) algorithm, we have derived mathematically the relationship between Xia et al.'s ingenious but heuristic algorithm and the SNB approach. We have also demonstrated the conditions under which Xia et al.'s crucial assumptions hold. We therefore hope that the cheminformatics community will find the new insight and recommendations useful.

Original language: English
Article number: 37
Journal: Journal of Cheminformatics
Issue number: 8
Publication status: Published - 23 Aug 2013


  • Cheminformatics
  • Classifications
  • Laplacian corrected modified Naive Bayes
  • Naive Bayes

