Find_novel_viruses - ababaian/serratus GitHub Wiki

Finding novel viruses

We are often asked, "How do I find novel RNA viruses related to X?" This tutorial outlines multiple lines of attack for this question, in order of increasing difficulty and rigour.

NEW. palmID web-application

Provide a viral RNA-dependent RNA polymerase (RdRP) sequence as input, and the palmID web-application will perform an RdRP-palmprint lookup against the Serratus databases, as described in Tutorial B below.

A. Serratus.io Lookup [15 m]

The easiest way to navigate to a novel virus is to use the serratus.io graphical web interface. "Click to discover".

B. RdRP palmprint sequence-search [2-4 h]

Sequence-based search with high specificity for novel viruses based on PalmDB sequences. "Script to discover".

C. RdRP microassembly sequence-search [4-8 h]

Sequence-based search with high sensitivity for novel viruses based on micro-assembled sequences. "Script to discover".

D. DIAMOND nr search on AWS EC2 [0.5-4 h]

Search the BLAST nr (non-redundant) protein database with DIAMOND on AWS EC2.

E. Virome-enrichment search [1-2 d]

Query the entire Serratus data set by searching on a third variable (host genome or metadata term).