Overview‐Metagenomics - iffatAGheyas/bioinformatics-tutorial-wiki GitHub Wiki

6.2.1 Overview & Experimental Design

Before diving into analysis, draw up a clear experimental plan so that your metagenomic data can actually answer your biological questions.


A. Marker-Gene (Amplicon) vs. Shotgun Metagenomics

| Feature | Marker-Gene (16S/ITS) | Shotgun Metagenomics |
|---|---|---|
| Target | Conserved marker loci (16S rRNA for bacteria; ITS for fungi) | Whole-community DNA (all genes) |
| Cost per sample | Low (hundreds of samples affordable) | Moderate to high (deeper sequencing required) |
| Taxonomic resolution | Genus to species (sometimes ambiguous) | Species to strain (depending on coverage) |
| Functional insight | None directly (marker only; function can only be predicted from taxonomy) | Direct (gene content, pathways) |
| Analysis complexity | Simpler pipelines (DADA2, QIIME2) | More steps (assembly, binning, annotation) |

Choose amplicon when you need broad surveys across many samples at low cost.
Choose shotgun when you require functional profiling, strain resolution, or genome recovery.


B. Key Study Design Considerations

  1. Replicates & Controls

    • Biological replicates (≥ 3 per group) to estimate natural variability.
    • Negative controls (e.g. blank extractions) to detect contamination.
    • Mock-community standards (commercial mixes) to benchmark accuracy.
  2. Sequencing Depth

    • Amplicon: 10 000–50 000 reads per sample are typically sufficient for diversity metrics.
    • Shotgun:
      • Taxonomy only: ~5–10 M reads/sample.
      • Assembly & binning: ≥ 20–50 M reads/sample (higher for complex communities).
  3. Read Length & Technology

    • Short reads (Illumina): 2×150 bp is standard for shotgun; amplicon runs often use 2×250 or 2×300 bp so that the paired reads overlap across the amplicon.
    • Long reads (PacBio HiFi, ONT) improve assembly and resolve repeats, but at higher cost.
  4. Multiplexing & Barcoding

    • Use unique dual indexes (a distinct i7/i5 pair for every sample) so that reads affected by index hopping can be detected and discarded (especially on patterned-flowcell platforms).
    • Balance library concentrations to achieve even coverage across samples.
  5. Sample Collection & Preservation

    • Stabilize samples immediately (e.g. flash-freezing or a preservation buffer) to prevent community shifts.
    • Standardize collection protocols (same swab type, time of day, storage temperature).
  6. Batch Effects

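    • Process all samples with the same protocol and, where possible, in the same batch (DNA extraction, library prep) to minimize technical bias; when several batches are unavoidable, spread each experimental group across the batches so that batch is not confounded with condition.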
    • Randomize sample order when sequencing large batches.
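
Several of the points above come down to quick arithmetic or a reproducible shuffle. The sketch below is one possible way to do this, not a standard tool: the function names, the `usable_fraction` safety margin, and all numbers in the demo are assumptions made for illustration. It estimates the mean coverage of a single genome at a given depth, how many samples can share a run at a target depth, and a batch assignment that spreads each experimental group evenly across processing batches.

```python
import math
import random
from collections import defaultdict


def expected_coverage(n_reads, read_length_bp, relative_abundance, genome_size_bp):
    """Mean fold-coverage of one genome in the community.

    Simplification: assumes reads are sampled in proportion to relative
    abundance and ignores host DNA, QC losses and uneven coverage.
    """
    return n_reads * read_length_bp * relative_abundance / genome_size_bp


def samples_per_run(run_output_reads, target_reads_per_sample, usable_fraction=0.8):
    """Number of samples that can be pooled on one run at the target depth.

    usable_fraction is an assumed margin for reads lost to filtering and
    uneven pooling; replace it with a value observed in your own runs.
    """
    if target_reads_per_sample <= 0:
        raise ValueError("target_reads_per_sample must be positive")
    return math.floor(run_output_reads * usable_fraction / target_reads_per_sample)


def assign_batches(sample_groups, n_batches, seed=42):
    """Assign samples to processing batches, balanced by experimental group.

    sample_groups maps sample ID -> group label. Samples are shuffled within
    each group and then dealt round-robin, so every batch receives a similar
    share of every group. A fixed seed keeps the assignment reproducible.
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for sample, group in sample_groups.items():
        by_group[group].append(sample)

    assignment = {}
    slot = 0
    for group in sorted(by_group):
        members = sorted(by_group[group])  # sort first so the shuffle is reproducible
        rng.shuffle(members)
        for sample in members:
            assignment[sample] = slot % n_batches
            slot += 1
    return assignment


if __name__ == "__main__":
    # Hypothetical organism: 5 Mb genome at 1 % relative abundance, 150 bp reads.
    for depth in (5e6, 20e6, 50e6):
        cov = expected_coverage(depth, 150, 0.01, 5e6)
        print(f"{depth / 1e6:.0f} M reads: ~{cov:.1f}x coverage")

    # Hypothetical run yielding 400 M reads, targeting 20 M reads per sample.
    print("samples per run:", samples_per_run(400e6, 20e6))

    # Hypothetical study: 6 control and 6 treated samples, 2 extraction batches.
    groups = {f"ctrl_{i}": "control" for i in range(6)}
    groups.update({f"trt_{i}": "treated" for i in range(6)})
    for sample, batch in sorted(assign_batches(groups, n_batches=2).items()):
        print(sample, "-> batch", batch)
```

Under these assumed numbers, a genome at 1 % abundance reaches only about 1.5× coverage at 5 M reads but about 15× at 50 M, which is why assembly and binning call for much deeper sequencing than taxonomic profiling. The same shuffle-then-deal approach can be reused to randomize the order in which samples are loaded for sequencing.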

Pro Tip: Write down your entire workflow, from sample collection through sequencing, in a study design document (e.g. a Markdown file such as design.md) so that every parameter is tracked and reproducible.
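
One way to keep such a document in sync with the actual plan is to generate it from a single dictionary of parameters. The following is a minimal sketch under stated assumptions: `render_design`, the section names, and the field names are hypothetical choices rather than a required layout, and the values are placeholders to be replaced with your own.

```python
from pathlib import Path


def render_design(title, sections):
    """Render a nested dict {section: {parameter: value}} as Markdown text."""
    lines = [f"# {title}", ""]
    for heading, params in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for key, value in params.items():
            lines.append(f"- **{key}:** {value}")
        lines.append("")
    return "\n".join(lines)


# Placeholder values for illustration only; record what you actually did.
design = {
    "Replicates & Controls": {
        "Biological replicates per group": "3",
        "Negative controls": "blank extraction",
        "Mock community": "commercial mix",
    },
    "Sequencing": {
        "Approach": "shotgun",
        "Platform and read length": "Illumina 2×150 bp",
        "Target depth": "20 M reads/sample",
    },
    "Sample Handling": {
        "Preservation": "flash-frozen at collection",
        "Batch assignment": "groups spread across extraction batches",
    },
}

if __name__ == "__main__":
    # Writes design.md into the current working directory.
    text = render_design("Study design", design)
    Path("design.md").write_text(text, encoding="utf-8")
    print(text)
```

Keeping the parameters in one structure means the same dictionary can later be read by analysis scripts, so the documented design and the values used downstream cannot drift apart.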