Abstract
Identifying genomic regions targeted by positive selection has been a longstanding interest of evolutionary biologists. This objective was difficult to achieve until the recent emergence of Next Generation Sequencing, which is fostering the development of large-scale catalogs of genetic variation for an increasing number of species. Several statistical methods have recently been developed to analyze these rich datasets, but there is still a poor understanding of the conditions under which these methods produce reliable results. This study aims to fill this gap by assessing the performance of genome-scan methods that explicitly consider the physical linkage among SNPs surrounding a selected variant. Our study compares the performance of seven recent methods for the detection of selective sweeps (iHS, nSL, EHHST, XP-EHH, XP-EHHST, XPCLR and hapFLK). We use an individual-based simulation approach to investigate the power and accuracy of these methods under a wide range of population models, for both hard and soft sweeps. Our results indicate that XPCLR and hapFLK perform best and can detect soft sweeps under simple population structure scenarios if the migration rate is low. All methods perform poorly with moderate to high migration rates or with weak selection, and very poorly under a hierarchical population structure. Finally, no single method is able to detect both starting and nearly completed selective sweeps. However, combining several methods (XPCLR or hapFLK with iHS or nSL) can greatly increase the power to pinpoint the selected region.
Original language | English |
---|---|
Pages (from-to) | 89-103 |
Journal | Molecular Ecology |
Volume | 25 |
Issue number | 1 |
Early online date | 12 Oct 2015 |
DOIs | |
Publication status | Published - Jan 2016 |
Keywords
- Positive selection
- Haplotype structure
- Genome scan