MagentaFlow - quadram-institute-bioscience/gmh-sops GitHub Wiki

MagentaFlow

Metagenomics and Metaviromics pipeline

MagentaFlow is a workflow for analysing whole-metagenome shotgun sequencing experiments from Illumina paired-end libraries.

It consists of:

  1. QC (fastp 0.20.1)
  2. Read-by-read analysis
    1. Taxonomic profiling with Kraken2 v2.0.8 using multiple databases:
      • RefSeq (Virus, Bacteria, Archaea, Human), January 2019
      • GTDB (r89_54k), for Bacteria and Archaea (link)
      • (optionally) GTDB, Bacteria only (link), 2018
    2. Taxonomic profiling using MetaPhlAn 3.0
    3. Functional profiling using HUMAnN 3.0
  3. De novo assembly (MEGAHIT v1.2.9)
    1. Gene prediction (Metaprokka 1.14.6c)
    2. Binning (using both MaxBin 2.2.7 and MetaBAT 2.15, then refining and combining the two bin sets with DAS_Tool 1.1.2)
    3. Functional analysis (eggNOG mapper 2.0.1)
  4. Backmapping (BBMap 38.57)
    1. Under construction
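
The order of the first stages can be illustrated with a minimal Python sketch that only builds the command lines for QC (fastp), Kraken2 profiling against several databases, and MEGAHIT assembly, without running anything. This is not MagentaFlow's own code: the function names, output file names, and directory layout are hypothetical, and the options actually used by the workflow may differ. Only basic, documented options of each tool are used (`-i/-I/-o/-O` for fastp, `--db/--paired/--report/--output` for Kraken2, `-1/-2/-o` for MEGAHIT).

```python
from pathlib import Path


def fastp_cmd(r1, r2, outdir):
    """Step 1 (QC): fastp on one paired-end sample. Output names are hypothetical."""
    out = Path(outdir)
    return [
        "fastp",
        "-i", str(r1),                       # forward reads in
        "-I", str(r2),                       # reverse reads in
        "-o", str(out / "clean_R1.fastq.gz"),  # forward reads out
        "-O", str(out / "clean_R2.fastq.gz"),  # reverse reads out
    ]


def kraken2_cmd(db, r1, r2, outdir):
    """Step 2.1: Kraken2 classification of paired reads against one database."""
    out = Path(outdir)
    label = Path(db).name                    # database label used in file names
    return [
        "kraken2",
        "--db", str(db),
        "--paired",
        "--report", str(out / f"{label}.report.txt"),
        "--output", str(out / f"{label}.kraken.txt"),
        str(r1), str(r2),
    ]


def megahit_cmd(r1, r2, outdir):
    """Step 3: de novo assembly of the quality-filtered reads with MEGAHIT."""
    return ["megahit", "-1", str(r1), "-2", str(r2), "-o", str(outdir)]


def plan(r1, r2, kraken_dbs, workdir):
    """Return the commands in pipeline order: QC, one Kraken2 run per database, assembly."""
    work = Path(workdir)
    qc = fastp_cmd(r1, r2, work / "qc")
    # Downstream steps consume the reads that fastp writes.
    clean_r1 = qc[qc.index("-o") + 1]
    clean_r2 = qc[qc.index("-O") + 1]
    commands = [qc]
    for db in kraken_dbs:
        commands.append(kraken2_cmd(db, clean_r1, clean_r2, work / "kraken2"))
    commands.append(megahit_cmd(clean_r1, clean_r2, work / "assembly"))
    return commands


if __name__ == "__main__":
    # Database paths below are placeholders, not the locations used by MagentaFlow.
    for cmd in plan("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                    ["db/refseq", "db/gtdb_r89_54k"], "work"):
        print(" ".join(cmd))
```

Keeping the commands as lists makes the dependency explicit: every read-based step and the assembly take the fastp output, so QC runs exactly once per sample.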

Workflow

(Figure: MagentaFlow workflow diagram)