Notes on ORA gold standards paper - bcb420-2024/Dien_Nguyen GitHub Wiki

Source

Ludwig Geistlinger, Gergely Csaba, Mara Santarelli, Marcel Ramos, Lucas Schiffer, Nitesh Turaga, Charity Law, Sean Davis, Vincent Carey, Martin Morgan, Ralf Zimmer, Levi Waldron, Toward a gold standard for benchmarking gene set enrichment analysis, Briefings in Bioinformatics, Volume 22, Issue 1, January 2021, Pages 545–556, https://doi.org/10.1093/bib/bbz158

Notes

  • Two predominantly used enrichment methods:
    • Overrepresentation analysis (ORA): tests whether a gene set contains disproportionately many genes with a significant expression change. It considers only genes that pass the differential expression (DE) threshold.
    • Gene set enrichment analysis (GSEA): tests whether the genes of a set accumulate at the top or bottom of the full gene vector, ordered by direction and magnitude of expression change. DE scores are computed for all genes first, then a score is computed for each gene set.
  • Network-based methods: evaluate measures of DE in the context of known interactions between genes
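
To make the difference between ORA and GSEA concrete, here is a minimal Python sketch. It is not taken from the paper or from any package. `ora_pvalue` assumes ORA is carried out as a one-sided hypergeometric test, which is a common choice. `running_sum_es` is a simplified, unweighted Kolmogorov-Smirnov-style running sum rather than the weighted GSEA statistic, and it does no permutation testing. All function and parameter names are made up for illustration.

```python
from math import comb


def ora_pvalue(universe_size, n_de, set_size, n_overlap):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    universe_size: number of genes in the background
    n_de:          number of genes passing the DE threshold
    set_size:      number of genes in the gene set
    n_overlap:     number of DE genes that are also in the gene set
    """
    max_overlap = min(n_de, set_size)
    total = comb(universe_size, n_de)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def running_sum_es(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a ranked gene list.

    Walk down the ranking; step up for genes in the set, step down
    otherwise. Return the running-sum value with the largest absolute
    deviation from zero (positive: top of list, negative: bottom).
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for hit in in_set:
        running += 1.0 / n_hits if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    # Toy numbers, not data from the paper.
    print(ora_pvalue(universe_size=10, n_de=3, set_size=3, n_overlap=3))
    print(running_sum_es(["a", "b", "c", "d"], {"a", "b"}))
```

ORA only needs four counts, so it works on a plain gene list. The running-sum score needs the full ranking of all genes, which is why GSEA-style methods compute DE scores for every gene first.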

GSEABenchmarkeR

  • An R/Bioconductor package that implements an executable benchmark framework for the systematic and reproducible assessment of gene set enrichment methods
  • Goal: quantitative assessment of enrichment analysis (EA) methods/algorithms, not of the tools that implement them
  • Considers runtime and statistical significance.

Guidelines

  • ORA for simple gene lists
  • Pre-ranked GSEA/CAMERA for pre-ranked gene lists
  • logTPMs as input for expression-based EA on the full expression matrix
  • ROAST/GSVA for self-contained null hypothesis
  • PADOG/ORA for competitive null hypothesis
  • SAFE for complex experimental design and directional hypothesis