TY - JOUR
T1 - Constrained models of evolution lead to improved prediction of functional linkage from correlated gain and loss of genes
AU - Barker, Daniel
AU - Meade, A
AU - Pagel, M
PY - 2007/1/1
Y1 - 2007/1/1
N2 - Motivation: We compare phylogenetic approaches for inferring functional gene links. The approaches detect independent instances of the correlated gain and loss of pairs of genes from species' genomes. We investigate the effect on results of basing evidence of correlations on two phylogenetic approaches, Dollo parsimony and maximum likelihood (ML). We further examine the effect of constraining the ML model by fixing the rate of gene gain at a low value, rather than estimating it from the data. Results: We detect correlated evolution among a test set of pairs of yeast (Saccharomyces cerevisiae) genes, with a case study of 21 eukaryotic genomes and test data derived from known yeast protein complexes. If the rate at which genes are gained is constrained to be low, ML achieves by far the best results at detecting known functional links. The model then has fewer parameters but it is more realistic by preventing genes from being gained more than once. Availability: BayesTraits by M. Pagel and A. Meade, and a script to configure and repeatedly launch it by D. Barker and M. Pagel, are available at http://www.evolution.reading.ac.uk.
AB - Motivation: We compare phylogenetic approaches for inferring functional gene links. The approaches detect independent instances of the correlated gain and loss of pairs of genes from species' genomes. We investigate the effect on results of basing evidence of correlations on two phylogenetic approaches, Dollo parsimony and maximum likelihood (ML). We further examine the effect of constraining the ML model by fixing the rate of gene gain at a low value, rather than estimating it from the data. Results: We detect correlated evolution among a test set of pairs of yeast (Saccharomyces cerevisiae) genes, with a case study of 21 eukaryotic genomes and test data derived from known yeast protein complexes. If the rate at which genes are gained is constrained to be low, ML achieves by far the best results at detecting known functional links. The model then has fewer parameters but it is more realistic by preventing genes from being gained more than once. Availability: BayesTraits by M. Pagel and A. Meade, and a script to configure and repeatedly launch it by D. Barker and M. Pagel, are available at http://www.evolution.reading.ac.uk.
KW - PROTEIN-PROTEIN INTERACTIONS
KW - MAXIMUM-LIKELIHOOD APPROACH
KW - ANCESTRAL CHARACTER STATES
KW - DISCRETE CHARACTERS
KW - SACCHAROMYCES-CEREVISIAE
KW - PHYLOGENETIC PROFILES
KW - GENOME
KW - DNA
KW - CONSERVATION
KW - DIVERGENCE
UR - http://www.scopus.com/inward/record.url?scp=33845898805&partnerID=8YFLogxK
UR - http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btl558?ijkey=SuERXx4FZJkB3Zv&keytype=ref
U2 - 10.1093/bioinformatics/btl558
DO - 10.1093/bioinformatics/btl558
M3 - Article
SN - 1367-4803
VL - 23
SP - 14
EP - 20
JO - Bioinformatics
JF - Bioinformatics
IS - 1
ER -