TY - JOUR
T1 - Advances to Bayesian network inference for generating causal networks from observational biological data
AU - Yu, Jing
AU - Smith, Victoria Anne
AU - Wang, Paul P
AU - Hartemink, Alexander J
AU - Jarvis, Erich D
PY - 2004/12/12
Y1 - 2004/12/12
N2 - Motivation: Network inference algorithms are powerful computational tools for identifying putative causal interactions among variables from observational data. Bayesian network inference algorithms hold particular promise in that they can capture linear, non-linear, combinatorial, stochastic and other types of relationships among variables across multiple levels of biological organization. However, challenges remain when applying these algorithms to limited quantities of experimental data collected from biological systems. Here, we use a simulation approach to make advances in our dynamic Bayesian network (DBN) inference algorithm, especially in the context of limited quantities of biological data. Results: We test a range of scoring metrics and search heuristics to find an effective algorithm configuration for evaluating our methodological advances. We also identify sampling intervals and levels of data discretization that allow the best recovery of the simulated networks. We develop a novel influence score for DBNs that attempts to estimate both the sign (activation or repression) and relative magnitude of interactions among variables. When faced with limited quantities of observational data, combining our influence score with moderate data interpolation reduces a significant portion of false positive interactions in the recovered networks. Together, our advances allow DBN inference algorithms to be more effective in recovering biological networks from experimentally collected data.
AB - Motivation: Network inference algorithms are powerful computational tools for identifying putative causal interactions among variables from observational data. Bayesian network inference algorithms hold particular promise in that they can capture linear, non-linear, combinatorial, stochastic and other types of relationships among variables across multiple levels of biological organization. However, challenges remain when applying these algorithms to limited quantities of experimental data collected from biological systems. Here, we use a simulation approach to make advances in our dynamic Bayesian network (DBN) inference algorithm, especially in the context of limited quantities of biological data. Results: We test a range of scoring metrics and search heuristics to find an effective algorithm configuration for evaluating our methodological advances. We also identify sampling intervals and levels of data discretization that allow the best recovery of the simulated networks. We develop a novel influence score for DBNs that attempts to estimate both the sign (activation or repression) and relative magnitude of interactions among variables. When faced with limited quantities of observational data, combining our influence score with moderate data interpolation reduces a significant portion of false positive interactions in the recovered networks. Together, our advances allow DBN inference algorithms to be more effective in recovering biological networks from experimentally collected data.
KW - GENETIC NETWORKS
KW - EXPRESSION
UR - http://www.scopus.com/inward/record.url?scp=12344259602&partnerID=8YFLogxK
UR - http://bioinformatics.oxfordjournals.org/cgi/reprint/20/18/3594
U2 - 10.1093/bioinformatics/bth448
DO - 10.1093/bioinformatics/bth448
M3 - Article
SN - 1367-4803
VL - 20
SP - 3594
EP - 3603
JO - Bioinformatics
JF - Bioinformatics
IS - 18
ER -