Supplementary Material and supporting data and code for "The discovery, distribution and diversity of DNA viruses associated with Drosophila melanogaster in Europe"
Drosophila melanogaster is an important model for antiviral immunity in arthropods, but very few DNA viruses have been described from the family Drosophilidae. This has limited our opportunity to use natural host-pathogen combinations in experimental studies, and may have biased our understanding of the Drosophila virome. In the associated study we report fourteen DNA viruses detectable by metagenomic analysis of six and a half thousand pool-sequenced Drosophila, sampled from 47 European locations between 2014 and 2016.
File S1: Excel spreadsheet detailing collection dates and locations
File S2: Text document listing the microorganisms included in the ‘Drosophila microbiome’ mapping reference, and the mitochondrial and plastid sequences included in the species diagnostic mapping reference.
File S3: Excel spreadsheet detailing the mapped read numbers from the DrosEU data. Sheet A gives raw mapped read counts, Sheet B gives counts normalised to read length in reads per kilobase per million reads (RPKM), and Sheet C gives raw counts of reads mapping to additional species-diagnostic loci.
File S4: DNA fasta file of virus fragments thought to be associated with contaminating taxa
File S5: Excel spreadsheet detailing the presence and read counts of DNA viruses in DrosEU datasets. Sheet A gives counts normalised to the fly, expressed as virus copy number in virus genomes per fly genome, together with estimated prevalence at three different detection thresholds. Sheet B provides metadata used for the statistical analysis.
File S6: DNA fasta file of assembled Vesanto virus segments, including divergent segments and segments assembled from public datasets.
File S7: Excel spreadsheet detailing the presence and read counts of DNA viruses in 28 publicly available Drosophila sequencing projects. Sheet 1 summarises the public datasets included, and Sheet 2 gives raw mapped read counts.
File S8: Excel spreadsheet detailing mean and total πA, πS and πA/πS for each gene (sheet A) and the number of synonymous and non-synonymous SNPs in the genome of Kallithea virus, Linvill Road virus and Vesanto virus (sheet B).
File S9: Figure showing A) variation in nucleotide diversity across non-coding and synonymous sites in the Kallithea virus genome, plotted as a sliding window with two window sizes, and B) the percentage of Kallithea virus infected samples that showed evidence of an indel. Intergenic regions of the genome are coloured in grey. A chi-square test for independence found a strong positive association between intergenic regions and InDels (X-squared = 3236, df = 1, p-value < 2.2e-16).
File S10: DNA fasta file of exemplar Galbut virus sequences aligned with the EVE and Gypsy-like LTR retroelement 297.
Folder S1: BASH Workflows outlining read mapping, de novo assembly, and sequence similarity searches to identify viruses
Folder S2: R Workflows and data files for prevalence and diversity analyses
Folder_S3: BEAST xml files and parameter logs for analyses of the Galbut virus EVE
Folder_S4: Alignments and tree files for phylogenetic relationships
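For readers unfamiliar with the RPKM unit used in File S3 (Sheet B), the sketch below shows the standard calculation. It is an illustration only, not the authors' workflow (their own scripts are in Folder S1 and Folder S2). It assumes that the "per kilobase" term refers to the length of the reference sequence the reads map to, as in the usual definition of RPKM; the function name, parameter names and the numbers in the usage example are hypothetical.

```python
def rpkm(mapped_reads, target_length_bp, total_reads):
    """Reads per kilobase of target per million reads (standard RPKM).

    mapped_reads     -- reads mapping to one target sequence (hypothetical name)
    target_length_bp -- length of that target in base pairs (assumed meaning
                        of the "per kilobase" term)
    total_reads      -- total reads in the library used for scaling
    """
    if target_length_bp <= 0 or total_reads <= 0:
        raise ValueError("target length and total reads must be positive")
    kilobases = target_length_bp / 1_000
    millions_of_reads = total_reads / 1_000_000
    return mapped_reads / kilobases / millions_of_reads


# Hypothetical usage: 500 reads on a 2 kb target in a library of 10 million reads.
example_value = rpkm(500, 2_000, 10_000_000)
print(example_value)
```

Dividing by target length makes long and short reference sequences comparable, and dividing by library size makes samples of different sequencing depth comparable.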
Date made available: 17 Mar 2021
Publisher: Figshare