Kodoja: a workflow for virus detection in plants using k-mer analysis of RNA-sequencing data

Amanda Baizan-Edge, Peter Cock, Stuart MacFarlane, Wendy McGavin, Lesley Torrance, Susan Jones

Research output: Contribution to journal › Article › peer-review



RNA-sequencing of plant material allows hypothesis-free detection of multiple viruses simultaneously. This methodology relies on bioinformatics workflows for virus identification. Most workflows are designed for human clinical data, and few go beyond sequence mapping for virus identification. We present a new workflow (Kodoja) for the detection of plant virus sequences in RNA-seq data. Kodoja uses k-mer profiling at the nucleotide level and sequence mapping at the protein level by integrating two existing tools, Kraken and Kaiju. Kodoja was tested on three existing RNA-seq datasets from grapevine and two new RNA-seq datasets from raspberry. For grapevine, Kodoja was shown to be more sensitive than a method based on contig building and BLAST alignments (27 viruses detected compared to 19). The application of Kodoja to raspberry showed that field-grown raspberries were infected by multiple viruses, and that RNA-seq can identify lower amounts of virus material than reverse transcriptase PCR. This work enabled the design of new PCR primers for detection of Raspberry yellow net virus and Beet ringspot virus. Kodoja is a sensitive method for plant virus discovery in field samples and enables the design of more accurate primers for detection. Kodoja is available to install through Bioconda and as a tool within Galaxy.
Original language: English
Pages (from-to): 533-542
Journal: Journal of General Virology
Early online date: 24 Jan 2019
Publication status: Published - 1 Mar 2019

