Nucleotide usage biases distort inferences of the species tree

Rui Borges, Bastien Boussau, Gergely J Szöllősi, Carolin Kosiol*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Despite the importance of natural selection in species' evolutionary history, phylogenetic methods that take into account population-level processes typically ignore selection. The assumption of neutrality is often based on the idea that selection occurs at a minority of loci in the genome and is unlikely to compromise phylogenetic inferences significantly. However, genome-wide processes like GC-bias and some variation segregating in coding regions are known to evolve in the nearly neutral range. As we are now using genome-wide data to estimate species trees, it is natural to ask whether weak but pervasive selection is likely to blur species tree inferences. We developed a polymorphism-aware phylogenetic model tailored for measuring signatures of nucleotide usage biases to test the impact of selection on the species tree. Our analyses indicate that while the inferred relationships among species are not significantly compromised, the genetic distances are systematically underestimated in a node-height-dependent manner: the deeper nodes tend to be more underestimated than the shallow ones. Such biases have implications for molecular dating. We dated the evolutionary history of 30 worldwide fruit fly populations and found signatures of GC-bias that considerably affect the divergence times estimated under the neutral model (by up to 23%). Our findings highlight the need to account for selection when quantifying divergence or dating species evolution.

Original language: English
Article number: evab290
Number of pages: 13
Journal: Genome Biology and Evolution
Issue number: 1
Early online date: 4 Jan 2022
Publication status: Published - Jan 2022


Keywords:

  • Species tree
  • Selection
  • Nearly neutral evolution
  • GC-bias
  • Molecular dating

