Consistency and identifiability of the polymorphism-aware phylogenetic models

Rui Borges, Carolin Kosiol

Research output: Contribution to journal > Article > peer-review

Abstract

Polymorphism-aware phylogenetic models (PoMo) constitute an alternative approach for species tree estimation from genome-wide data. PoMo builds on the standard substitution models of DNA evolution but expands the classic alphabet of the four nucleotide bases to include polymorphic states. By doing so, PoMo accounts for ancestral and current intra-population variation, while also accommodating the population-level processes that govern the substitution process (e.g. genetic drift, mutations, allelic selection). PoMo has proven to be a valuable tool in several phylogenetic applications, but a proof of statistical consistency (and of identifiability, a necessary condition for consistency) is lacking. Here, we prove that PoMo is identifiable and, using this result, we further show that the maximum a posteriori (MAP) tree estimator of PoMo is a consistent estimator of the species tree. We complement our theoretical results with a simulated data set mimicking the diversity observed in natural populations exhibiting incomplete lineage sorting. We implemented PoMo in a Bayesian framework and show that the MAP tree easily recovers the true tree for the numbers of sites typically sampled in genome-wide analyses.

Original language: English
Article number: 110074
Pages (from-to): 1-6
Number of pages: 6
Journal: Journal of Theoretical Biology
Volume: 486
Early online date: 8 Nov 2019
Publication status: Published - 7 Feb 2020

Keywords

  • Polymorphism-aware models
  • Phylogenetics
  • Species tree estimation
  • Consistency
  • Identifiability
