6.2.1 Overview & Experimental Design
Before diving into analysis, a clear experimental plan ensures that your metagenomic data will answer your biological questions.
A. Marker-Gene (Amplicon) vs. Shotgun Metagenomics
| Feature | Marker-Gene (16S/ITS) | Shotgun Metagenomics |
|---|---|---|
| Target | Conserved marker loci (16S rRNA for bacteria; ITS for fungi) | Whole community DNA (all genes) |
| Cost per sample | Low (hundreds of samples affordable) | Moderate–high (deeper sequencing required) |
| Taxonomic resolution | Genus to species (sometimes ambiguous) | Species to strain (depending on coverage) |
| Functional insight | None (marker only) | Direct (gene content, pathways) |
| Analysis complexity | Simpler pipelines (DADA2, QIIME2) | More steps (assembly, binning, annotation) |
Choose amplicon when you need broad surveys across many samples at low cost.
Choose shotgun when you require functional profiling, strain resolution, or genome recovery.
B. Key Study Design Considerations
- Replicates & Controls
  - Biological replicates (≥ 3 per group) to estimate natural variability.
  - Negative controls (e.g. blank extractions) to detect contamination.
  - Mock-community standards (commercial mixes) to benchmark accuracy.
- Sequencing Depth
  - Amplicon: 10 000–50 000 reads per sample is typically sufficient for diversity metrics.
  - Shotgun:
    - Taxonomy only: ~5–10 M reads/sample.
    - Assembly & binning: at least 20–50 M reads/sample (higher for complex communities).
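- Read Length & Technology
  - Short reads (Illumina 2×150 bp) are standard for shotgun; amplicon runs often use longer paired reads (e.g. 2×250 or 2×300 bp) so that the read pairs overlap across the amplicon.
  - Long reads (PacBio HiFi, ONT) improve assembly and resolve repeats, but at higher cost.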
- Multiplexing & Barcoding
  - Use unique dual-index (UDI) barcodes so that index-hopped reads can be detected and discarded (index hopping is most pronounced on patterned-flowcell platforms).
  - Balance library concentrations before pooling to achieve even coverage across samples.
- Sample Collection & Preservation
  - Immediate stabilization (e.g. flash-freeze, preservation buffer) to prevent community shifts.
  - Standardize collection protocols (same swab type, time of day, storage temperature).
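- Batch Effects
  - Process all samples in a single batch (DNA extraction, library prep) where possible; when several batches are unavoidable, spread each experimental group across them so that batch is not confounded with group.
  - Randomize sample order when sequencing large batches.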
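
The batch-effect advice above can be sketched in a few lines of Python. This is only an illustration, not a prescribed method: the function names `assign_batches` and `run_order`, the fixed seed, and the round-robin dealing scheme are all assumptions made for this example. The idea is to shuffle samples within each experimental group and then deal them across batches, so that every batch contains a similar mix of groups.

```python
import random
from collections import defaultdict


def assign_batches(samples, n_batches, seed=42):
    """Assign samples to processing batches, balanced by group.

    samples   : list of (sample_id, group) tuples
    n_batches : number of extraction / library-prep batches
    seed      : fixed seed so the layout can be regenerated later

    Returns a dict mapping sample_id -> batch index (0-based).
    """
    rng = random.Random(seed)

    # Collect sample IDs per experimental group.
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    assignment = {}
    offset = 0  # continue the round-robin across groups to keep batch sizes even
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)  # random order within the group
        for i, sample_id in enumerate(ids):
            assignment[sample_id] = (i + offset) % n_batches
        offset += len(ids)
    return assignment


def run_order(sample_ids, seed=42):
    """Return the sample IDs in a reproducible random processing order."""
    rng = random.Random(seed)
    order = list(sample_ids)
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    # Hypothetical study: 6 control and 6 treated samples, 3 batches.
    samples = [(f"C{i}", "control") for i in range(1, 7)]
    samples += [(f"T{i}", "treated") for i in range(1, 7)]
    batches = assign_batches(samples, n_batches=3)
    for sample_id in run_order(batches):
        print(sample_id, "-> batch", batches[sample_id])
```

Fixing the seed means the same layout can be regenerated and recorded alongside the rest of the study parameters.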
Pro Tip: Write down your entire workflow, from sample collection through sequencing, in a study design document (e.g. a Markdown file such as `design.md`) so that every parameter is tracked and reproducible.
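
One parameter worth recording in such a document is the sequencing budget implied by the Sequencing Depth guidance in section B. The sketch below is a rough planning aid under stated assumptions: the run output of 400 M reads, the 80 % usable fraction (an allowance for QC losses and uneven pooling), the study name, and the function names `samples_per_run` and `render_design_md` are all hypothetical values chosen for illustration, not figures from any particular instrument.

```python
def samples_per_run(run_reads, reads_per_sample, usable_percent=80):
    """How many samples fit on one run at the requested depth.

    run_reads        : total reads the run is expected to yield (assumption)
    reads_per_sample : target depth per sample
    usable_percent   : share of reads expected to survive QC (assumption)
    """
    if reads_per_sample <= 0:
        raise ValueError("reads_per_sample must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (run_reads * usable_percent) // (100 * reads_per_sample)


def render_design_md(study, strategy, n_samples, reads_per_sample,
                     run_reads, usable_percent=80):
    """Return a short Markdown summary suitable for a design document."""
    per_run = samples_per_run(run_reads, reads_per_sample, usable_percent)
    if per_run == 0:
        raise ValueError("one run cannot supply this depth for even one sample")
    runs_needed = -(-n_samples // per_run)  # ceiling division
    lines = [
        f"# Study design: {study}",
        "",
        f"- Strategy: {strategy}",
        f"- Samples: {n_samples}",
        f"- Target depth: {reads_per_sample:,} reads/sample",
        f"- Assumed run output: {run_reads:,} reads ({usable_percent}% usable)",
        f"- Samples per run: {per_run}",
        f"- Runs needed: {runs_needed}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical shotgun taxonomy study at 10 M reads/sample.
    print(render_design_md(
        study="example-gut-survey",
        strategy="shotgun (taxonomy only)",
        n_samples=40,
        reads_per_sample=10_000_000,
        run_reads=400_000_000,
    ))
```

The returned text can be written to `design.md` and extended with the collection, extraction, and library-prep details for the study.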