Notes on ORA gold standards paper
Ludwig Geistlinger, Gergely Csaba, Mara Santarelli, Marcel Ramos, Lucas Schiffer, Nitesh Turaga, Charity Law, Sean Davis, Vincent Carey, Martin Morgan, Ralf Zimmer, Levi Waldron, Toward a gold standard for benchmarking gene set enrichment analysis, Briefings in Bioinformatics, Volume 22, Issue 1, January 2021, Pages 545–556, https://doi.org/10.1093/bib/bbz158
- Two predominantly used enrichment methods:
  - Overrepresentation analysis (ORA): tests whether a gene set contains disproportionately many genes with a significant expression change; only genes that pass the differential expression (DE) threshold are analyzed.
  - Gene set enrichment analysis (GSEA): tests whether the genes of a set accumulate at the top or bottom of the full gene vector, ordered by direction and magnitude of expression change. DE scores are computed for all genes first, then the gene set score.
- Network-based methods: evaluate measures of DE in the context of known interactions.
- The paper presents an R/Bioconductor package that implements an executable benchmark framework for the systematic and reproducible assessment of gene set enrichment methods.
- Goal: quantitative assessment of enrichment analysis (EA) methods/algorithms, not tools.
- Considers runtime and statistical significance.
- Recommendations from the paper:
  - ORA for simple gene lists
  - Pre-ranked GSEA/CAMERA for pre-ranked gene lists
  - logTPMs as RNA-seq input for expression-based EA on the full expression matrix
  - ROAST/GSVA for the self-contained null hypothesis
  - PADOG/ORA for the competitive null hypothesis
  - SAFE for complex experimental designs and directional hypotheses
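
The ORA idea noted above (does a gene set contain disproportionately many DE genes?) is commonly framed as a one-sided hypergeometric test. Below is a minimal sketch in Python using only the standard library. The function names `ora_pvalue` and `ora` and the toy gene identifiers are hypothetical illustrations, not taken from the paper or its R/Bioconductor package.

```python
from math import comb


def ora_pvalue(universe_size, set_size, de_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of genes tested overall
    set_size:      number of universe genes in the gene set
    de_size:       number of universe genes passing the DE threshold
    overlap:       number of DE genes that fall in the gene set
    """
    total = comb(universe_size, de_size)
    upper = min(set_size, de_size)
    tail = sum(
        comb(set_size, i) * comb(universe_size - set_size, de_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def ora(universe, de_genes, gene_set):
    """Run the test on gene identifiers, restricted to the universe."""
    universe = set(universe)
    de = set(de_genes) & universe
    gs = set(gene_set) & universe
    return ora_pvalue(len(universe), len(gs), len(de), len(de & gs))


if __name__ == "__main__":
    # Toy data (hypothetical): 20 genes, 5 DE genes, 5 genes in the set,
    # 4 of which are DE.
    universe = [f"g{i}" for i in range(20)]
    de_genes = universe[:5]
    gene_set = universe[1:6]
    print(ora(universe, de_genes, gene_set))  # about 0.0049
```

Only the counts enter the test, which is why ORA works on simple gene lists. The magnitude and direction of change are discarded once the DE threshold is applied, in contrast to the ranking-based GSEA approach.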