Personal profile
Research overview
Our research covers everything that is broadly both chemical and computational. Some of the main themes are described below, but if you're just after a list of publications, here you are.
- Machine Learning
A substantial part of computational chemistry involves building mathematical models to analyse data. The Machine Learning (ML) part of our work comprises everything that is not an attempt to model realistically the processes by which the real world actually works. In jargon, this is everything that is not physics-based. Such tasks might firstly be regression, that is, predicting numerical values such as solubilities. Secondly, they might be classification, assigning items such as molecules to classes like "toxic" or "non-toxic". Thirdly, they might be clustering, finding patterns in unlabelled data. In our group, we use such models to predict properties such as solubility, bioactivity and toxicity.
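As a purely illustrative sketch of the regression task described above (not the group's own code), the following fits a straight line by ordinary least squares. The descriptor and property values are synthetic toy numbers chosen to lie exactly on a line, and the names `fit_line`, `descriptor` and `log_solubility` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and the x-y cross term
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic toy data (an assumption for illustration, not measured values):
# a hypothetical molecular descriptor and a hypothetical property,
# constructed so that property = -2 * descriptor + 1 exactly.
descriptor = [0.0, 1.0, 2.0, 3.0]
log_solubility = [1.0, -1.0, -3.0, -5.0]

slope, intercept = fit_line(descriptor, log_solubility)

# Predict the property for a new, unseen descriptor value
predicted = slope * 4.0 + intercept
```

Real models of this kind would typically use many descriptors and non-linear methods, but the train-then-predict pattern is the same.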
Such modelling in fact has a long history in chemistry, dating back to the 19th century. However, for much of that time models were limited to simple linear regressions. In the latter part of the 20th century, the field developed through building QSAR (Quantitative Structure-Activity Relationship) and QSPR (Quantitative Structure-Property Relationship) models with multi-linear regression, and then moved on to non-linear methods. The field was usually known as chemoinformatics (or cheminformatics, being unsure how to spell its own name). In the modern era, the sophistication of the models has increased to a point where it's more descriptive, and certainly more widely understood, to call these techniques Machine Learning.
What about Artificial Intelligence (AI) - can we cut a divide between ML and AI? Probably not a clear one. As Google DeepMind executive Mat Velloso said: “If it is written in Python, it's probably machine learning. If it is written in PowerPoint, it's probably AI.” While there's clearly a substantial overlap between the categories, we tend to refer to souped-up non-linear regression models as ML, but to LLMs as AI. Nonetheless, under the lid, LLMs are just large neural networks doing neural network things like optimising weights.
- Molecular Simulation
By way of contrast, molecular simulation is definitely a physics-based approach. We set up in the computer a mathematical representation of the molecules involved, one that typically includes the chemical nature and spatial co-ordinates of each constituent atom. The computer then produces a possible future of that molecular system, calculating its response to its physical and chemical environment at each timestep to create a trajectory, in a process known as Molecular Dynamics.
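The timestep-by-timestep propagation described above can be sketched with the velocity Verlet integrator, one widely used scheme for Molecular Dynamics. This is a minimal one-particle, one-dimensional illustration under stated assumptions: the harmonic `spring_force`, the unit mass and the timestep are hypothetical choices, not taken from any particular simulation package.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D; return its trajectory and final velocity."""
    trajectory = [x]
    f = force(x)
    for _ in range(n_steps):
        # Half-step velocity update using the current force
        v_half = v + 0.5 * dt * f / mass
        # Full-step position update
        x = x + dt * v_half
        # Recompute the force at the new position
        f = force(x)
        # Complete the velocity update with the new force
        v = v_half + 0.5 * dt * f / mass
        trajectory.append(x)
    return trajectory, v


def spring_force(x, k=1.0):
    """Toy harmonic restoring force, standing in for a real force field."""
    return -k * x


trajectory, final_v = velocity_verlet(
    x=1.0, v=0.0, force=spring_force, mass=1.0, dt=0.01, n_steps=1000
)

# Total energy (kinetic + potential) should stay close to its initial value of 0.5
final_energy = 0.5 * final_v ** 2 + 0.5 * trajectory[-1] ** 2
```

In a real simulation the same loop runs over every atom in three dimensions, with the forces supplied by the force field.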
If carried out intelligently, such methods can provide great scientific insight into the behaviour of the system, covering things such as structure, energy, interactions with other molecules, phase changes and much more. Typically, simulations are carried out with the molecules contained in 3-dimensional boxes that are stacked together without limit in all directions and fill space with no gaps, a scenario described as having "periodic boundary conditions." Our group use such methods for structural studies of the interactions between enzymes and their substrates, with applications like plastic-eating enzymes and new medicines.
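Periodic boundary conditions are commonly implemented with two small operations: wrapping coordinates back into the primary box, and taking the shortest separation between two particles across all periodic images (the "minimum image" convention). The sketch below assumes a cubic box; the function names and the box side length are hypothetical.

```python
import math


def wrap(x, box):
    """Map a coordinate back into the primary box [0, box)."""
    return x % box


def minimum_image(dx, box):
    """Shortest signed separation along one axis, over all periodic images."""
    return dx - box * round(dx / box)


def periodic_distance(a, b, box):
    """Distance between points a and b in a cubic box of side `box`."""
    return math.sqrt(
        sum(minimum_image(ai - bi, box) ** 2 for ai, bi in zip(a, b))
    )


# Two particles near opposite faces of a box of side 10 are close neighbours
# once the periodic images are taken into account.
d = periodic_distance((0.5, 0.5, 0.5), (9.5, 0.5, 0.5), 10.0)
```

Here the two particles are 9 units apart inside the box but only 1 unit apart through the periodic boundary, which is the separation the simulation uses.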
The forces, or more explicitly the interaction energies, between molecules are defined by a "force field", which has little to do with science fiction but a lot to do with the fundamental physical processes governing the attractive and repulsive interactions amongst atoms and molecules. This forms a major part of the scientific input into simulations. Historically, force fields have been either fitted to experiment or parameterised via theoretical calculation, but increasingly they are now being generated through ML.
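To make the idea of a force field concrete, here is a sketch of one classic non-bonded term, the Lennard-Jones 12-6 potential, which combines short-range repulsion with longer-range attraction. It is shown only as a common textbook example; the text above does not say which functional forms the group uses, and the parameters `epsilon` and `sigma` are set to 1 in arbitrary reduced units.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Return (energy, force) for two atoms separated by distance r.

    energy = 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6)
    force  = -d(energy)/dr, so positive means repulsive.
    """
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6 ** 2 - sr6)
    force = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r
    return energy, force


# At r = sigma the energy crosses zero and the force is repulsive
e_sigma, f_sigma = lennard_jones(1.0)

# At r = 2**(1/6) * sigma the energy is at its minimum (-epsilon)
# and the force vanishes
r_min = 2.0 ** (1.0 / 6.0)
e_min, f_min = lennard_jones(r_min)
```

A full force field adds further terms, for example for bonds, angles and electrostatics, each with its own fitted parameters.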
- Quantum Chemistry
For all the usefulness of simulations, typically their force fields know nothing about covalent bond making or breaking, which means that they can't be used to study chemical reactions, molecular orbitals or even the vibrational motions of molecules. Instead, a more chemically intelligent approach is required, and this is provided by the electronic structure methods of quantum chemistry. Such approaches are known as "first principles," due to their sound basis in atomic and molecular quantum mechanics.
The most foundational such method historically has been Hartree-Fock self-consistent field theory (HF). However, in this century, Density Functional Theory (DFT) has become a much more widely known and used alternative, largely because it generally gives a more accurate result at a lesser cost.
We use quantum chemical methods such as HF and DFT for a variety of applications, including the energetics of chemical reactions, development of force fields, physics-based calculation of solubility and the prediction of crystal structures. While our group are very much users rather than developers of quantum chemical methods, we appreciate their central role in computational chemistry.
- Bioinformatics
The sequential and alphabetical nature of both DNA and proteins makes them a rich source of computational research. Study of these essential and foundational biomolecules provides a window into the evolutionary history of life and its chemistry, as well as the impressive structural diversity of protein folds. Our own research frequently occupies the interface between chemistry and biology, the interactions between large biological polymers and smaller molecules being fundamental to processes of life and disease alike.
Much of our work in these areas has centred on enzymes, their chemical functions and their evolutionary histories. In this post-AlphaFold era, we continue to seek out new research questions that can shed light on the rich and diverse repertoire of biochemistry. In this endeavour, we frequently collaborate with colleagues in Biology as well as Chemistry.
Additional information about the current Mitchell Group can be found here: https://jbomgroup.wp.st-andrews.ac.uk/
Biography
John Mitchell has a PhD in Theoretical Chemistry from Cambridge. He returned there from University College London in 2000, taking up a lectureship in Chemistry. He was appointed to a readership at St Andrews in 2009. His recent research has used computational techniques in pharmaceutical chemistry and structural bioinformatics. His group have worked extensively on prediction of bioactivity, solubility, melting point and hydrophobicity from chemical structure, using both informatics and theoretical chemistry methodologies. Recently they have developed novel applications of machine learning in computational biochemistry, such as drug side effect prediction, and identifying athletic performance enhancers.
Profile Keywords
Machine Learning, Artificial Intelligence & informatics in Chemistry; Prediction of solubility and other molecular thermodynamic properties; Modelling the organic crystalline state; Classification and computer-based representation of enzyme reaction mechanisms; Bioinformatics studies of molecular evolution; Modelling protein-ligand interactions.
Teaching activity
Lecturer CH5714 Chemical Applications of Electronic Structure Calculations; Lecturer CH4431 Scientific Writing; Lecturer CH3717 Statistical Mechanics and Computational Chemistry; Convenor & Tutor, CH1202 Introductory Chemistry; Lecturer ID2005 Scientific Thinking; Tutor CH2701 Physical Chemistry 2; Tutor CH1401 Introductory Inorganic and Physical Chemistry; Tutor CH5461 Integrating Chemistry; Project Supervisor CH4442 & CH5441 Research Projects; Lecturer SUPACCH Computational Chemistry (Postgraduate course).
Education/Academic qualification
Doctor of Philosophy, Theoretical Studies of Hydrogen Bonding, University of Cambridge
1 Oct 1987 → 30 Sept 1990
Award Date: 2 Feb 1991
Keywords
- QD Chemistry
- solubility
- computational chemistry
- chemoinformatics
- bioinformatics
- Machine Learning
- Artificial Intelligence
- simulation
Expertise related to UN Sustainable Development Goals
In 2015, UN member states agreed to 17 global Sustainable Development Goals (SDGs) to end poverty, protect the planet and ensure prosperity for all. This person’s work contributes towards the following SDG(s):
- SDG 3 Good Health and Well-being
- SDG 4 Quality Education
- SDG 11 Sustainable Cities and Communities
- SDG 13 Climate Action
- SDG 14 Life Below Water
Research output
- Can human experts predict solubility better than computers?
Boobier, S., Osbourn, A. & Mitchell, J. B. O., 13 Dec 2017, In: Journal of Cheminformatics. 9, 63
Research output: Contribution to journal › Article › peer-review
Open Access
- Enzyme function and its evolution
Mitchell, J. B. O., Dec 2017, In: Current Opinion in Structural Biology. 47, p. 151-156
Research output: Contribution to journal › Review article › peer-review
Open Access
- Probing the average distribution of water in organic hydrate crystal structures with radial distribution functions (RDFs)
Skyner, R. E., Mitchell, J. B. O. & Groom, C., 28 Jan 2017, In: CrystEngComm. 19, 4, p. 641-652 12 p.
Research output: Contribution to journal › Article › peer-review
Open Access
- A review of methods for the calculation of solution free energies and the modelling of systems in solution
Skyner, R. E., McDonagh, J. L., Groom, C. R., van Mourik, T. & Mitchell, J. B. O., 17 Mar 2015, In: Physical Chemistry Chemical Physics. 17, 9, p. 6174-6191
Research output: Contribution to journal › Article › peer-review
Open Access
- One origin for metallo-β-lactamase activity, or two? An investigation assessing a diverse set of reconstructed ancestral sequences based on a sample of phylogenetic trees
Alderson, R. G., Barker, D. & Mitchell, J. B. O., Oct 2014, In: Journal of Molecular Evolution. 79, 3-4, p. 117-129 13 p.
Research output: Contribution to journal › Article › peer-review
Open Access
Datasets
- DLS-100 Solubility Dataset
Mitchell, J. B. O. (Creator) & McDonagh, J. (Contributor), University of St Andrews, 31 Oct 2017
DOI: 10.17630/3a3a5abc-8458-4924-8e6c-b804347605e8
Dataset
- Data underpinning: Greedy and linear ensembles of machine learning methods outperform single approaches for QSPR regression problems
Kew, W. (Creator) & Mitchell, J. B. O. (Creator), University of St Andrews, 2015
Dataset
- Data underpinning: Probing the average distribution of water in organic hydrate crystal structures with radial distribution functions (RDFs)
Skyner, R. E. (Creator), Mitchell, J. B. O. (Creator) & Groom, C. (Contributor), Royal Society of Chemistry, 19 Dec 2016
https://doi.org/10.1039/C6CE02119K
Dataset
- Data underpinning: Why do sequence signatures predict enzyme mechanism? Homology versus Chemistry
Beattie, K. (Creator), De Ferrari, L. (Creator) & Mitchell, J. B. O. (Creator), SAGE Publications Ltd STM, 2015
Dataset
Projects
- Wellcome Trust 091959/Z/10/Z: International Genetically Engineered Machine Competition Student Stipends
Mitchell, J. B. O. (PI)
1/03/10 → 31/12/10
Project: Standard
- Machine Learning Approaches to Predict Enzyme Function
Mitchell, J. B. O. (PI) & De Ferrari, L. (Researcher)
Biotechnology and Biological Sciences Research Council
1/09/11 → 31/12/14
Project: Standard
Activities
- Methods and Applications of Crystal Structure Prediction
Mitchell, J. B. O. (Participant)
11 Jul 2018 → 13 Jul 2018
Activity: Participating in or organising an event types › Participation in or organising a conference
- Sutton Trust Summer School
Mitchell, J. B. O. (Invited speaker)
5 Jul 2018
Activity: Talk or presentation types › Presentation
- International Science Summer School
Mitchell, J. B. O. (Invited speaker)
3 Jul 2018
Activity: Talk or presentation types › Presentation
- ScotCHEM Computational Chemistry Symposium 2018
Mitchell, J. B. O. (Organiser)
14 Jun 2018 → 15 Jun 2018
Activity: Participating in or organising an event types › Participation in or organising a conference
- CECAM Solubility Prediction Workshop
Mitchell, J. B. O. (Participant)
14 May 2018 → 15 May 2018
Activity: Participating in or organising an event types › Participation in or organising a conference
Prizes
- iGEM Bronze Medal
Mitchell, J. B. O. (Recipient), Melo Czekster, C. (Recipient), Stokes, V. A. (Recipient), Ferreira, H. C. (Recipient) & Hooley, C. A. (Recipient), 28 Oct 2018
Prize: Prize (including medals and awards)
- iGEM Gold Medal
Mitchell, J. B. O. (Recipient), Smith, V. A. (Recipient), Melo Czekster, C. (Recipient), Schwarz-Linek, U. (Recipient), Hooley, C. (Recipient) & Bentley, K. (Recipient), 4 Nov 2019
Prize: Prize (including medals and awards)
Press/Media
- Ethical Dilemmas That Artificial Intelligence Raises in the Lab
2/07/19
1 Media contribution
Press/Media: Relating to Research