Hypervirome - serratus-bio/open-virome GitHub Wiki

A solitary {Virome} graph is defined as a set of sequencing datasets that share a common trait, together with their collection of viruses.

A {{Hyper-Virome}} is a representation of multiple {Viromes} and their interactions through common viruses. From each individual {Virome}, all runs (nodes) are aggregated into a single hyper-node, and the contigs for each virus (edges per virome) are aggregated into a single hyper-edge per virus.

Hyper Virome Icon

For example, if viromes {A}, {B}, and {C} each contain virus X (black outline), then each of the nodes {A}, {B}, and {C} will have a hyper-edge to a single virus X node (<X>). The hyper-edge between {A} and <X> is the aggregation of the 4 contigs from runs in {A}. A {{Hyper-Virome}} therefore captures the interactions between {Viromes}.
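
The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Open Virome implementation: the record layout `(virome, run, virus, contig_id)`, the function name `build_hyper_virome`, and all identifiers such as `run1` or `c1` are hypothetical.

```python
from collections import defaultdict

# Hypothetical contig records: (virome, run, virus, contig_id).
# The layout and all identifiers are assumptions for illustration only.
contigs = [
    ("A", "run1", "X", "c1"),
    ("A", "run1", "X", "c2"),
    ("A", "run2", "X", "c3"),
    ("A", "run3", "X", "c4"),
    ("B", "run4", "X", "c5"),
    ("C", "run5", "X", "c6"),
    ("C", "run5", "Y", "c7"),
]


def build_hyper_virome(records):
    """Collapse runs into hyper-nodes and per-virus contigs into hyper-edges."""
    hyper_nodes = defaultdict(set)  # virome -> set of runs
    hyper_edges = defaultdict(set)  # (virome, virus) -> set of contig ids
    for virome, run, virus, contig_id in records:
        hyper_nodes[virome].add(run)
        hyper_edges[(virome, virus)].add(contig_id)
    return dict(hyper_nodes), dict(hyper_edges)


hyper_nodes, hyper_edges = build_hyper_virome(contigs)

# Viromes that share virus X are linked through the single <X> node.
shares_x = sorted(virome for (virome, virus) in hyper_edges if virus == "X")
print(shares_x)                        # ['A', 'B', 'C']
print(len(hyper_edges[("A", "X")]))    # 4
```

Keying the hyper-edges by the `(virome, virus)` pair keeps one edge per pair no matter how many runs or contigs contribute to it, which mirrors the example of {A} and <X> above.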

The {{Hyper-Virome}} Graph

The {{Hyper-Virome}} is represented as a weighted, undirected, bipartite graph where:

  • Virome Node (square): a set of sequencing runs and their viruses as a {Virome}.
  • Virus Node (hexagon): an abstract unit of virus, defined here as species-like Operational Taxonomic Units (sOTU) of RNA viruses (see: palmDB)
  • Edge (solid line): the aggregated set of contigs within the virome matching the virus
  • Edge (weight): line thickness is scaled by the virome-virus 'Vrank', a measure of the virus's importance within the virome
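
As a rough sketch of how such a graph could be held in memory, the Python below stores one record per virome-virus edge. Everything in it is an assumption made for illustration: the `HyperEdge` class, the node labels, the contig counts, and the `vrank` values are hypothetical, and since the Vrank calculation is not described here it is treated as a given number between 0 and 1. The linear width scale in `line_width` is likewise a guess, not the scale used to draw the graph.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class HyperEdge:
    virome: str      # virome node (square)
    virus: str       # virus node (hexagon), e.g. an sOTU label
    n_contigs: int   # size of the aggregated contig set
    vrank: float     # edge weight; treated as a given number here


# Hypothetical edges; labels and weights are made up for illustration.
edges = [
    HyperEdge("A", "X", 4, 0.9),
    HyperEdge("B", "X", 1, 0.2),
    HyperEdge("C", "X", 1, 0.5),
    HyperEdge("C", "Y", 1, 0.1),
]

virome_nodes = {e.virome for e in edges}
virus_nodes = {e.virus for e in edges}

# Bipartite: the two node sets are disjoint, and every edge joins
# one virome node to one virus node by construction.
assert virome_nodes.isdisjoint(virus_nodes)


def line_width(vrank, lo=1.0, hi=6.0, max_vrank=1.0):
    """Map a weight in [0, max_vrank] linearly onto a drawing width (assumed scale)."""
    return lo + (hi - lo) * (vrank / max_vrank)


def linked_viromes(edges):
    """Return pairs of viromes joined through at least one common virus."""
    by_virus = {}
    for e in edges:
        by_virus.setdefault(e.virus, set()).add(e.virome)
    pairs = set()
    for viromes in by_virus.values():
        pairs.update(combinations(sorted(viromes), 2))
    return sorted(pairs)


print(linked_viromes(edges))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

Because the graph is undirected, each edge is stored once, and virome-to-virome relationships are derived by grouping edges on the virus node rather than being stored directly.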

The {{SRA Genus}} spans 236,788 sequencing datasets, which make up 4,451 individual {Genus} viromes (one per taxonomic genus). The 71,497 RNA virus nodes interlink the viromes through 176,902 edges, which summarize 819,696 individual contigs.