Advancing Bayesian network algorithms for inferring gene regulation using an integrative computational-biological approach in a yeast model system

Project: Standard

Project Details

Key findings

Novel methodology:
We developed a novel methodology for predicting protein expression from mRNA data. It combines mRNA measurements with ribosome density, ribosome occupancy, codon usage, gene copy number, and mRNA folding free energy in a generalised linear model that learns to predict protein expression. Two key assumptions underlie this methodology: (1) additional, unknown factors relating mRNA to protein will be similar for proteins involved in the same biological process, and (2) this relationship can be learned from protein and mRNA levels collected under “control” conditions that are the same across multiple experiments. We therefore build separate generalised linear models for individual functional groups of proteins (e.g., pathways), using multiple mRNA experiments plus the other information listed above to predict protein expression in the “control” condition.

We applied this methodology to budding yeast, S. cerevisiae (see below). To apply it to another system, the following are required: (1) the additional genetic information about each gene (e.g., ribosome density and occupancy), (2) a categorisation of genes into functional processes (e.g., KEGG pathways), (3) at least one, and preferably more, high-throughput quantitative protein expression datasets taken under the relevant “control” conditions, and (4) multiple high-throughput quantitative mRNA expression datasets containing the same “control” conditions.
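The per-pathway modelling described above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a Gaussian generalised linear model with an identity link (equivalent to ordinary least squares), uses only two illustrative features per gene, and fits made-up toy numbers. The function names, the pathway labels and the data are all hypothetical.

```python
from collections import defaultdict


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: too few or collinear genes")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_gaussian_glm(features, targets):
    """Fit intercept + coefficients via the normal equations.

    Assumption: Gaussian family, identity link. The source does not
    state which family or link function was used.
    """
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def fit_per_pathway(records):
    """Fit one model per functional group (e.g., one per pathway)."""
    grouped = defaultdict(list)
    for pathway, feats, protein in records:
        grouped[pathway].append((feats, protein))
    return {
        pathway: fit_gaussian_glm([f for f, _ in rows], [y for _, y in rows])
        for pathway, rows in grouped.items()
    }


def predict(coefs, feature_row):
    """Predict protein expression for one gene from its feature values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


# Synthetic toy data: (pathway, (mRNA level, codon usage score), protein level).
toy_records = [
    ("pathway_A", (1.0, 0.0), 3.0),
    ("pathway_A", (2.0, 1.0), 5.5),
    ("pathway_A", (3.0, 0.0), 7.0),
    ("pathway_A", (0.0, 2.0), 2.0),
    ("pathway_B", (1.0, 1.0), 3.0),
    ("pathway_B", (2.0, 0.0), 1.0),
    ("pathway_B", (0.0, 1.0), 2.0),
    ("pathway_B", (3.0, 2.0), 8.0),
]

models = fit_per_pathway(toy_records)
```

Fitting each group separately reflects the first assumption: genes in the same process share one set of coefficients, while different processes are free to differ. A real application would add the remaining features (ribosome density and occupancy, gene copy number, folding energy) as further columns and would need enough genes per pathway to estimate them.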

Predictive models:
We developed predictive models for protein expression from mRNA expression for 38 KEGG pathways in budding yeast, S. cerevisiae. These are all the pathways with enough proteins represented in the protein expression datasets to support learning.

Evolved synthetic constructs:
We evolved a single ancestral synthetic construct of budding yeast, S. cerevisiae (with GFP-tagged osmotic stress response protein TPS2), under mild osmotic stress produced by liquid media containing 0.3 M NaCl. The construct was evolved in 16 replicates, 8 each in either shaking or static culture, for 90-150 days (600-1000 generations) with stocks saved every 3 days (20 generations), resulting in 480+ “snapshots” of evolution.
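The generation and snapshot counts follow from the stocking schedule, as the short check below illustrates. It assumes every replicate was stocked on the same 3-day schedule for the whole run; the function and constant names are illustrative only.

```python
REPLICATES = 16             # 8 shaking + 8 static cultures
DAYS_PER_STOCK = 3          # a stock was saved every 3 days
GENERATIONS_PER_STOCK = 20  # roughly 20 generations per 3-day interval


def snapshot_summary(days):
    """Generations reached and total stocks saved after `days` of evolution."""
    stocks_per_replicate = days // DAYS_PER_STOCK
    return {
        "generations": stocks_per_replicate * GENERATIONS_PER_STOCK,
        "snapshots": stocks_per_replicate * REPLICATES,
    }


shortest = snapshot_summary(90)   # lower end of the 90-150 day range
longest = snapshot_summary(150)   # upper end of the range
```

Under these assumptions the 90-day lower bound gives the 600 generations and 480 snapshots quoted above, and 150 days gives 1000 generations.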
Acronym: BB/F001398/1 Bayesian Network
Effective start/end date: 15/02/08 to 14/02/12


  • BBSRC: £563,884.52

UN Sustainable Development Goals

In 2015, UN member states agreed to 17 global Sustainable Development Goals (SDGs) to end poverty, protect the planet and ensure prosperity for all. This project contributes towards the following SDG(s):

  • SDG 3 - Good Health and Well-being

