Genomic analyses in Drosophila do not support the classic allopatric model of speciation (data & code)

Dataset

Description

Inferring demographic history and effective migration genome-wide in species pairs across Drosophila.

The analysis pipeline is as follows:
Sample metadata: Random selection (where possible) of publicly available sequences from NCBI. This includes filtering of metadata to ensure only WGS datasets containing a single individual were used, as well as data that

Genome annotation via BRAKER2 with D. melanogaster proteins.

Read filtering, mapping and variant calling for all pairs.

Preprocessing of VCFs.

Summary statistic calculation (coverage, hetA, hetB, hetAB, absolute genetic divergence).

Filtering intronic intervals using BEDTools.

Extraction of distribution of pairwise divergence.

Modelling via Mathematica notebooks.

Analysis of modelling results in R.
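
The summary statistic step above can be illustrated with a minimal Python sketch. It assumes that hetA, hetB and hetAB are counts of biallelic sites that are heterozygous in individual A only, in individual B only, and in both, and that absolute divergence is the mean proportion of differences between one allele drawn from A and one drawn from B. These definitions, the function names and the toy genotypes are assumptions made for illustration; they are not taken from the authors' pipeline.

```python
from itertools import product


def site_class(gt_a, gt_b):
    """Classify one biallelic site from two diploid genotypes, e.g. (0, 1) and (0, 0)."""
    a_het = gt_a[0] != gt_a[1]
    b_het = gt_b[0] != gt_b[1]
    if a_het and b_het:
        return "hetAB"      # heterozygous in both individuals
    if a_het:
        return "hetA"       # heterozygous in A only
    if b_het:
        return "hetB"       # heterozygous in B only
    if gt_a[0] != gt_b[0]:
        return "fixed"      # homozygous for different alleles
    return "invariant"      # homozygous for the same allele


def pairwise_difference(gt_a, gt_b):
    """Proportion of (allele from A, allele from B) combinations that differ."""
    pairs = list(product(gt_a, gt_b))
    return sum(x != y for x, y in pairs) / len(pairs)


def summarise_pair(sites):
    """Count site classes and compute absolute divergence over callable sites.

    `sites` is a list of (genotype_A, genotype_B) tuples, one per callable site.
    """
    counts = {"hetA": 0, "hetB": 0, "hetAB": 0, "fixed": 0, "invariant": 0}
    total_difference = 0.0
    for gt_a, gt_b in sites:
        counts[site_class(gt_a, gt_b)] += 1
        total_difference += pairwise_difference(gt_a, gt_b)
    summary = dict(counts)
    summary["dxy"] = total_difference / len(sites) if sites else float("nan")
    return summary


# Toy input (hypothetical): one site of each class.
toy_sites = [
    ((0, 1), (0, 0)),
    ((0, 0), (0, 1)),
    ((0, 1), (0, 1)),
    ((0, 0), (1, 1)),
    ((0, 0), (0, 0)),
]
toy_summary = summarise_pair(toy_sites)
```

In practice the genotypes would come from the filtered VCFs and the denominator from the callable intronic intervals, so the sketch only shows the arithmetic, not the data handling.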

Please find the dataset containing all of our analysis results for SI, IM, IIM and SC models for 93 pairs in the file: Publish_Dataframe_SDistributionDemographicAnalyses_Oct2025.csv
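
As a starting point for exploring the results file, the following is a minimal Python sketch that reads the CSV with the standard library and tallies rows by a chosen column. The column names of the file are not documented on this page, so the sketch first returns the header for inspection; the `pair` and `model` columns in the demo lines are hypothetical placeholders, not the real schema.

```python
import csv
from collections import Counter

# File name as given in the dataset description.
RESULTS_FILE = "Publish_Dataframe_SDistributionDemographicAnalyses_Oct2025.csv"


def read_results(lines):
    """Parse CSV text (any iterable of lines) into (column names, list of row dicts)."""
    reader = csv.DictReader(lines)
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def load_results(path=RESULTS_FILE):
    """Open the results file from disk and parse it."""
    with open(path, newline="") as handle:
        return read_results(handle)


def count_by(rows, column):
    """Tally rows by the value found in one column."""
    return Counter(row[column] for row in rows)


# Demo on in-memory lines with hypothetical column names.
demo_lines = ["pair,model", "p1,SI", "p1,IM", "p2,IM"]
demo_columns, demo_rows = read_results(demo_lines)
demo_counts = count_by(demo_rows, "model")
```

With the real file, calling `load_results()` and printing the returned column names first shows which column, if any, identifies the fitted model (SI, IM, IIM or SC) before `count_by` is applied.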
Date made available: 2025
Publisher: GitHub
