Information-based summary statistics for spatial genetic structure inference

Xinghu Qin*, Oscar E. Gaggiotti*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review



The measurement of biodiversity at all levels of organization is an essential first step towards understanding the ecological and evolutionary processes that drive spatial patterns of biodiversity. Ecologists have explored a wide range of summary statistics and have come to the view that information-based summary statistics, in particular the so-called Hill numbers, are a useful tool for measuring biodiversity. Population geneticists, on the other hand, have focused largely on summary statistics based on heterozygosity and on measures of allelic richness. However, recent studies have proposed adopting information-based summary statistics in population genetics. Here, we performed a comprehensive assessment of the power of this family of summary statistics to inform on spatial patterns of genetic diversity, and we compared it with that of traditional population genetics approaches, namely measures based on allelic richness and heterozygosity. To give an unbiased evaluation, we used three machine learning methods to test how well different sets of summary statistics discriminate between spatial scenarios. We defined three distinct sets: (i) a set based on allelic richness measures, which included the Jaccard index; (ii) a set based on heterozygosity, which included FST; and (iii) a set based on Hill numbers derived from Shannon entropy, which included the recently proposed Shannon differentiation, ΔD. The results showed that the last of these performed as well as or, under some specific spatial scenarios, even better than the traditional population genetics measures. Interestingly, we found that a genetic differentiation measure based on allelic richness that is rarely, if ever, used, Jaccard dissimilarity (J), showed the highest power to discriminate among spatial scenarios, followed by Shannon differentiation, ΔD. We concluded, therefore, that information-based measures as well as Jaccard dissimilarity represent excellent additions to the population genetics toolkit.
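
As a rough illustration of the three families of statistics compared in the abstract, the sketch below computes a Hill number of order q, expected heterozygosity, and a Jaccard dissimilarity for a single locus. It is not the authors' implementation: the function names and the toy allele frequencies are hypothetical, the Jaccard measure is assumed here to operate on allele presence/absence, and neither FST nor Shannon differentiation ΔD is implemented.

```python
import math


def hill_number(freqs, q):
    """Effective number of alleles (Hill number) of order q at one locus."""
    p = [f for f in freqs if f > 0]
    if q == 1:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(f * math.log(f) for f in p))
    return sum(f ** q for f in p) ** (1.0 / (1.0 - q))


def expected_heterozygosity(freqs):
    """Expected heterozygosity: one minus the sum of squared frequencies."""
    return 1.0 - sum(f * f for f in freqs)


def jaccard_dissimilarity(freqs_a, freqs_b):
    """Jaccard dissimilarity on allele presence/absence (an assumption here)."""
    present_a = {i for i, f in enumerate(freqs_a) if f > 0}
    present_b = {i for i, f in enumerate(freqs_b) if f > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical allele frequencies at one locus in two populations.
pop_a = [0.5, 0.5, 0.0, 0.0]
pop_b = [0.25, 0.25, 0.25, 0.25]

richness_a = hill_number(pop_a, 0)            # order 0: allelic richness
shannon_a = hill_number(pop_a, 1)             # order 1: exp(Shannon entropy)
het_a = expected_heterozygosity(pop_a)        # heterozygosity-based measure
jaccard_ab = jaccard_dissimilarity(pop_a, pop_b)
```

In this toy case the two populations share two of the four alleles observed overall, so the presence/absence dissimilarity depends only on which alleles occur, whereas the order-1 Hill number also weights alleles by their frequencies.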
Original language: English
Article number: 13606
Pages (from-to): 2183-2195
Number of pages: 13
Journal: Molecular Ecology Resources
Issue number: 6
Early online date: 24 Mar 2022
Publication status: Published - 1 Aug 2022


  • Information-based statistics
  • Population genetics
  • Spatial structure

