Journal Entry ‐ GESA - bcb420-2024/Anna_Lai GitHub Wiki
GSEA Analysis
Date March 19, 2024
Steps
- Download the file MesenvsImmuno_RNASeq_ranks.rnk from GitHub.
- Open the official GSEA website and download the software for analysis. The version GSEA_Win_4.3.3 was downloaded.
- Open the software, go to Run GSEAPreranked, expand the basic fields, and set the max size to 200 and the min size to 15.
Note that there is a recent update to the Bader Lab gene set.
Parameters used:
- Gene set = Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt (can be found here: Link to data folder)
- Max size = 200
- Min size = 15
- Gene set permutations = 1000
- Collapse/Remap to gene symbols = no collapse
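The effect of the max size and min size parameters can be illustrated with a minimal sketch. It assumes that a gene set is first restricted to the genes present in the ranked list and is then kept only if the remaining size falls within the thresholds. The function names and the toy data are hypothetical and are not part of GSEA; this is an illustration of the idea, not the software's implementation.

```python
import io


def read_gmt(handle):
    """Parse GMT lines: name <tab> description <tab> gene1 <tab> gene2 ..."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        gene_sets[fields[0]] = {gene for gene in fields[2:] if gene}
    return gene_sets


def filter_by_size(gene_sets, ranked_genes, min_size=15, max_size=200):
    """Keep sets whose overlap with the ranked list is within [min_size, max_size]."""
    ranked = set(ranked_genes)
    kept = {}
    for name, genes in gene_sets.items():
        overlap = genes & ranked
        if min_size <= len(overlap) <= max_size:
            kept[name] = overlap
    return kept


# Toy data (hypothetical): three small gene sets and a six-gene ranked list.
toy_gmt = io.StringIO(
    "SET_A\tdesc\tG1\tG2\tG3\n"
    "SET_B\tdesc\tG1\n"
    "SET_C\tdesc\tG1\tG2\tG3\tG4\tG5\n"
)
toy_ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]

toy_sets = read_gmt(toy_gmt)
# Small thresholds are used here only so the toy example has something to filter.
kept = filter_by_size(toy_sets, toy_ranked, min_size=2, max_size=4)
print(sorted(kept))
```

With the thresholds from this entry (15 and 200), a gene set with fewer than 15 or more than 200 genes in the ranked list would be dropped. If no gene symbol in the ranked list matches the gene set file, every set is dropped.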
Errors encountered
After pruning, none of the gene sets passed size thresholds.
<Error Details>
---- Full Error Message ----
Multiple rows mapped to the symbol ''RRM2'. This is not allowed in Remap_only m ...
---- Stack Trace ----
# of exceptions: 1
------Multiple rows mapped to the symbol ''RRM2'. This is not allowed in Remap_only mode.------
xtools.api.param.BadParamException: Multiple rows mapped to the symbol ''RRM2'. This is not allowed in Remap_only mode.
at org.gsea_msigdb.gsea/edu.mit.broad.genome.alg.DatasetGenerators.collapse(DatasetGenerators.java:194)
at org.gsea_msigdb.gsea/xtools.gsea.GseaPreranked.getRankedList(GseaPreranked.java:175)
at org.gsea_msigdb.gsea/xtools.gsea.GseaPreranked.execute(GseaPreranked.java:97)
at org.gsea_msigdb.gsea/edu.mit.broad.xbench.tui.TaskManager$ToolRunnable.run(TaskManager.java:391)
at java.base/java.lang.Thread.run(Unknown Source)
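The error message indicates that the symbol RRM2 appears on more than one row of the ranked list. A minimal sketch for finding such duplicates before running the analysis is shown here. The function names and the toy scores are hypothetical, and keeping the row with the largest absolute score is only one possible policy (an assumption, not something GSEA or this journal prescribes).

```python
from collections import Counter


def find_duplicate_symbols(rank_rows):
    """Return the symbols that occur on more than one row, sorted by name."""
    counts = Counter(symbol for symbol, _ in rank_rows)
    return sorted(symbol for symbol, n in counts.items() if n > 1)


def dedupe_keep_strongest(rank_rows):
    """Keep one row per symbol: the one with the largest absolute score.

    This policy is an assumption made for illustration only.
    """
    best = {}
    for symbol, score in rank_rows:
        if symbol not in best or abs(score) > abs(best[symbol]):
            best[symbol] = score
    # Return rows ordered from highest to lowest score, like a .rnk file.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy rows (hypothetical scores), with RRM2 deliberately duplicated.
toy_rows = [("RRM2", 2.5), ("AEBP1", 33.2), ("RRM2", -3.1), ("ANTXR1", 35.5)]

duplicates = find_duplicate_symbols(toy_rows)
deduped = dedupe_keep_strongest(toy_rows)
print(duplicates)
print(deduped)
```

In practice the rows would be read from the tab-separated .rnk file (symbol in the first column, score in the second) rather than typed in by hand.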
Final Result
Questions and Answers
- Explain the reasons for using each of the above parameters. The max size of 200 limits the results to a reasonable range for inspection. 1000 is the default number of permutations. The chipset was chosen because gene names were used as the identifiers.
- What is the top gene set returned for the Mesenchymal subtype?
- What are its p-value, ES, NES and FDR? Enrichment Score (ES): 0.8634064, Normalized Enrichment Score (NES): 2.5887244
- How many genes are in its leading edge? 3296 / 5931 gene sets
- What is the top gene associated with this gene set? The top two gene sets are: HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION, EXTRACELLULAR MATRIX ORGANIZATION (GOBP)
- 1500 gene sets are significantly enriched at FDR < 25%
- 862 gene sets are significantly enriched at nominal p-value < 1%
- 1204 gene sets are significantly enriched at nominal p-value < 5%
- What is the top gene set returned for the Immunoreactive subtype?
- What are its p-value, ES, NES and FDR?
- How many genes are in its leading edge? 2635 / 5931 gene sets
- What is the top gene associated with this gene set? The top two gene sets are: HALLMARK_INTERFERON_ALPHA_RESPONSE, HALLMARK_INTERFERON_GAMMA_RESPONSE
- 1475 gene sets are significantly enriched at FDR < 25%
- 811 gene sets are significantly enriched at nominal p-value < 1%
- 1173 gene sets are significantly enriched at nominal p-value < 5%
NAME | TITLE | SCORE |
---|---|---|
IGDCC3 | immunoglobulin superfamily DCC subclass member 3 [Source:HGNC Symbol;Acc:HGNC:9700] | 36.329580 |
ANTXR1 | ANTXR cell adhesion molecule 1 [Source:HGNC Symbol;Acc:HGNC:21014] | 35.479492 |
AEBP1 | AE binding protein 1 [Source:HGNC Symbol;Acc:HGNC:303] | 33.191590 |
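The significance counts reported in the answers above (FDR < 25%, nominal p-value < 1% and < 5%) can be recomputed from a tab-separated report table. The sketch below assumes the table has columns named `NOM p-val` and `FDR q-val`; these column names, the function name, and the toy values are assumptions for illustration and should be checked against the actual report file.

```python
import csv
import io


def count_significant(rows, column, threshold):
    """Count rows whose value in `column` is strictly below `threshold`."""
    count = 0
    for row in rows:
        value = (row.get(column) or "").strip()
        if value and float(value) < threshold:
            count += 1
    return count


# Toy report (hypothetical values), standing in for a GSEA report table.
toy_report = io.StringIO(
    "NAME\tNES\tNOM p-val\tFDR q-val\n"
    "SET_A\t2.59\t0.000\t0.000\n"
    "SET_B\t1.80\t0.008\t0.120\n"
    "SET_C\t1.20\t0.040\t0.300\n"
    "SET_D\t0.90\t0.600\t0.900\n"
)
rows = list(csv.DictReader(toy_report, delimiter="\t"))

fdr_25 = count_significant(rows, "FDR q-val", 0.25)
p_01 = count_significant(rows, "NOM p-val", 0.01)
p_05 = count_significant(rows, "NOM p-val", 0.05)
print(fdr_25, p_01, p_05)
```

Running the same counts on each phenotype's report separately would give one set of numbers for the Mesenchymal subtype and one for the Immunoreactive subtype.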
References (if any)
- https://baderlab.github.io/Cytoscape_workflows/EnrichmentMapPipeline/Protocol2_createEM.html
- https://baderlab.github.io/CBW_Pathways_2023/gsea_mod3.html