Abstract
1. With the increasing application of pooled-sequencing approaches to population genomics, robust methods are needed to accurately quantify allele frequency differences between populations. Identifying consistent differences across stratified populations can allow us to detect genomic regions that are under selection and that differ between populations with different histories or attributes. Current popular statistical tests are easily implemented in widely available software tools, which makes them simple for researchers to apply. However, there are potential problems with the way such tests are used, which means that underlying assumptions about the data are frequently violated.
2. These problems are highlighted by simulation of simple but realistic population genetic models of neutral evolution, and the performance of different tests is assessed. We present alternative tests (including GLMs with quasibinomial error structure) with attractive properties for the analysis of allele frequency differences and re-analyse a published dataset.
3. The simulations show that common statistical tests for consistent allele frequency differences perform poorly, with high false positive rates. Applying tests that do not confound heterogeneity and main effects significantly improves inference. Variation in sequencing coverage likely produces many false positives, and re-scaling allele frequencies to counts out of a common value or an effective sample size reduces this effect.
4. Many researchers are interested in identifying allele frequencies that vary consistently across replicates to identify loci underlying phenotypic responses to selection or natural variation in phenotypes. Popular methods that have been suggested for this task perform poorly in simulations. Overall, quasibinomial GLMs perform better and also have the attractive feature of allowing correction for multiple testing by standard procedures and are easily extended to other designs.
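The two ideas in points 3 and 4 can be illustrated with a small sketch. It is not the authors' implementation: the function names `rescale_counts` and `quasibinomial_f`, the toy counts, and the simplified design (one treatment factor, no stratum term) are all assumptions made for illustration. The sketch rescales counts to a common depth and then computes a quasi-likelihood F statistic, in which the treatment deviance is divided by a Pearson dispersion estimate so that disagreement among replicates weakens the evidence for a main effect rather than inflating it.

```python
import math


def rescale_counts(alt, depth, target):
    """Rescale each (alt, depth) pair to a count out of `target` reads.

    `target` is a hypothetical common value chosen by the user, for example
    a common coverage or an effective sample size.
    """
    return [(round(a * target / d), target) for a, d in zip(alt, depth)]


def _deviance(alt, depth, fitted):
    """Binomial deviance of observed counts against fitted probabilities."""
    dev = 0.0
    for y, n, p in zip(alt, depth, fitted):
        mu = n * p
        if y > 0:
            dev += y * math.log(y / mu)
        if n - y > 0:
            dev += (n - y) * math.log((n - y) / (n - mu))
    return 2.0 * dev


def quasibinomial_f(alt, depth, group):
    """Quasi-likelihood F statistic for a single categorical predictor.

    Assumed simplified model: allele frequency depends on `group` only.
    Returns (F statistic, Pearson dispersion estimate).
    """
    levels = sorted(set(group))
    df_resid = len(alt) - len(levels)
    if df_resid <= 0:
        raise ValueError("need more observations than group levels")

    # Null model: one pooled frequency. Full model: one frequency per group.
    p_null = sum(alt) / sum(depth)
    p_group = {}
    for g in levels:
        a = sum(y for y, k in zip(alt, group) if k == g)
        d = sum(n for n, k in zip(depth, group) if k == g)
        p_group[g] = a / d
    fit_null = [p_null] * len(alt)
    fit_full = [p_group[k] for k in group]

    # Pearson dispersion: > 1 signals more replicate-to-replicate
    # variation than binomial sampling alone would produce.
    pearson = sum(
        (y - n * p) ** 2 / (n * p * (1.0 - p))
        for y, n, p in zip(alt, depth, fit_full)
        if 0.0 < p < 1.0
    )
    phi = pearson / df_resid
    if phi == 0.0:
        return math.inf, phi

    delta_dev = _deviance(alt, depth, fit_null) - _deviance(alt, depth, fit_full)
    f_stat = delta_dev / (len(levels) - 1) / phi
    return f_stat, phi


# Toy data (invented): 4 replicates per treatment, 100 reads each.
depth = [100] * 8
group = ["A"] * 4 + ["B"] * 4
consistent = [30, 32, 28, 31, 50, 52, 48, 51]
heterogeneous = [5, 60, 10, 46, 95, 20, 70, 16]  # same group totals

f_con, phi_con = quasibinomial_f(consistent, depth, group)
f_het, phi_het = quasibinomial_f(heterogeneous, depth, group)
print(f"consistent:    F = {f_con:.2f}, dispersion = {phi_con:.2f}")
print(f"heterogeneous: F = {f_het:.2f}, dispersion = {phi_het:.2f}")
```

Both toy datasets have identical pooled frequencies per treatment, so a test on pooled counts would treat them alike; here the heterogeneous replicates yield a larger dispersion and therefore a smaller F statistic. A p-value would come from comparing the statistic with an F distribution on the numerator and residual degrees of freedom, which is omitted to keep the sketch dependency-free.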
Original language | English |
---|---|
Pages (from-to) | 1899-1909 |
Journal | Methods in Ecology and Evolution |
Volume | 8 |
Issue number | 12 |
Early online date | 15 Jun 2017 |
DOIs | |
Publication status | Published - Dec 2017 |
Keywords
- Allele frequency differences
- Quasibinomial GLM
- CMH-test
- Pool-seq
- Selection
- Experimental evolution
Projects
- 2 Finished
-
Genomic Evolution in Real Time: Genomic evolution in real time: causes and consequences of an adaptive mutation in the wild
Bailey, N. W. (PI) & Ritchie, M. G. (CoI)
9/01/12 → 8/01/15
Project: Standard
-
Evolution of gene expression: Evolution of gene expression in response to sexual selection
Ritchie, M. G. (PI)
1/07/11 → 30/06/14
Project: Standard
Datasets
-
Identifying consistent allele frequency differences in studies of stratified populations (dataset - GitHub)
Wiberg, R. A. W. (Creator), Gaggiotti, O. E. (Creator), Morrissey, M. B. (Creator) & Ritchie, M. G. (Creator), GitHub, 2017
https://github.com/RAWWiberg/ER_PoolSeq_Simulations
Dataset
-
Identifying consistent allele frequency differences in studies of stratified populations (dataset - Dryad)
Wiberg, R. A. W. (Creator), Gaggiotti, O. E. (Creator), Morrissey, M. B. (Creator) & Ritchie, M. G. (Creator), Dryad, 2017
DOI: 10.5061/dryad.mn0tv, http://datadryad.org/resource/doi:10.5061/dryad.60k68
Dataset