GSEA - bcb420-2024/Krutika_Joshi GitHub Wiki

Steps taken to perform GSEA using R

  1. Downloaded the ranks of mesenchymal and immunoreactive genes from here
  2. Downloaded the gene sets from the Bader lab. I used the gene set file titled Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt.
  3. Then I followed the steps noted to run GSEA using R.
  4. I changed run_gsea <- FALSE to run_gsea <- TRUE.
  5. I skipped step 5.3 from the instructions as I had manually downloaded the latest pathway definition file.
  6. Lastly, I ran GSEA using the required parameters from the journal entry (i.e., a maximum gene set size of 200, a minimum gene set size of 15, and gene set permutation with 1000 permutations).
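As a rough sketch of how these settings could map onto a GSEA command-line call from R: the flags -gmx, -rnk, -nperm, -set_max, -set_min, -rpt_label and -out are GSEA command-line options, but every path and the analysis name below are hypothetical placeholders, and the instructions I followed set further options that are left at their defaults here.

```r
# Sketch only: paths and the analysis name are hypothetical placeholders.
run_gsea <- TRUE

gsea_cli      <- "path/to/gsea-cli.sh"   # hypothetical GSEA install location
rnk_file      <- "path/to/ranks.rnk"     # hypothetical rank file name
gmt_file      <- "Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt"
output_dir    <- "gsea_output"           # hypothetical output directory
analysis_name <- "mesen_vs_immuno"       # hypothetical report label

# Assemble the pre-ranked GSEA call with the three required parameters.
command <- paste(
  gsea_cli, "GSEAPreranked",
  "-gmx", gmt_file,
  "-rnk", rnk_file,
  "-nperm 1000",     # number of gene set permutations
  "-set_max 200",    # maximum gene set size
  "-set_min 15",     # minimum gene set size
  "-rpt_label", analysis_name,
  "-out", output_dir
)

# Only call GSEA if the switch is on and the launcher actually exists.
if (run_gsea && file.exists(gsea_cli)) {
  system(command)
} else {
  message("GSEA not run; the command would be:\n", command)
}
```

With the placeholder paths the script only prints the command, so it is safe to run before pointing it at a real installation.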

Question/Answers:

1. Question One

Maximum geneset size of 200

  • Capping gene sets at 200 genes helps GSEA prioritize biologically relevant gene sets. Large gene sets tend to cover diverse functions, which makes the results hard to interpret, so lowering the maximum size keeps the focus on more specific pathways. It also lets GSEA run faster, saving time and resources, and it decreases the risk of false positive results.

Minimum geneset size of 15

  • A minimum gene set size of 15 helps ensure that the pathway being examined is biologically relevant. The minimum filters out very small gene sets, whose enrichment scores are unreliable and add noise to the results.
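The effect of the two size limits can be illustrated with a small sketch. The gene sets and gene names below are made-up placeholders, not real pathways. Sizes are counted after restricting each set to genes present in the ranked list, which is how GSEA applies the limits.

```r
# Toy data: hypothetical gene sets built from placeholder gene IDs.
make_set <- function(n) paste0("GENE", seq_len(n))
gene_sets <- list(
  TINY_SET   = make_set(5),
  MEDIUM_SET = make_set(80),
  HUGE_SET   = make_set(450)
)

# Hypothetical ranked list containing 300 genes.
ranked_genes <- paste0("GENE", 1:300)

set_min <- 15
set_max <- 200

# Count only the genes of each set that appear in the ranked list.
sizes <- vapply(gene_sets, function(g) sum(g %in% ranked_genes), integer(1))

# Keep the sets whose effective size falls within the limits.
kept <- gene_sets[sizes >= set_min & sizes <= set_max]
names(kept)
```

Here TINY_SET falls below the minimum and HUGE_SET still has 300 genes after the overlap, so only MEDIUM_SET is kept.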

Gene set permutation

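  • Gene set permutation builds a null distribution of enrichment scores by scoring randomly drawn gene sets, and this null is what GSEA uses to estimate p-values and the FDR that corrects for multiple hypothesis testing. Since we are testing many pathways simultaneously, this correction is important because it decreases the risk of false positive results.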
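To make the idea of gene set permutation concrete, here is a minimal sketch on simulated data. It is a simplified illustration, not GSEA's actual implementation: the ranked list, the gene set, and the function name enrichment_score are all hypothetical, and the sketch stops at a nominal p-value without computing NES or FDR.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Toy ranked list: 1000 hypothetical genes with scores sorted high to low.
scores <- sort(rnorm(1000), decreasing = TRUE)
names(scores) <- paste0("GENE", seq_along(scores))

# Weighted running-sum enrichment score (weight exponent 1), GSEA-style.
enrichment_score <- function(scores, gene_set) {
  in_set  <- names(scores) %in% gene_set
  hit     <- cumsum(ifelse(in_set, abs(scores), 0)) / sum(abs(scores[in_set]))
  miss    <- cumsum(!in_set) / sum(!in_set)
  running <- hit - miss
  running[which.max(abs(running))]  # maximum deviation from zero
}

# Hypothetical gene set drawn mostly from the top of the ranked list.
gene_set <- names(scores)[c(1:20, 500:509)]
observed <- enrichment_score(scores, gene_set)

# Null distribution: score random gene sets of the same size.
n_perm  <- 1000
null_es <- replicate(
  n_perm,
  enrichment_score(scores, sample(names(scores), length(gene_set)))
)

# Nominal p-value: share of same-sign null scores at least as extreme.
same_sign <- null_es[sign(null_es) == sign(observed)]
p_value   <- mean(abs(same_sign) >= abs(observed))
```

Because the toy gene set is concentrated at the top of the list, its observed score sits far in the tail of the null distribution, which is what a small p-value reflects.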

2. Question Two

Mesenchymal subtype

  • top gene set: HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION%MSIGDBHALLMARK%HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
  • pvalue: 0.0
  • ES: 0.8653797
  • NES: 2.5404062
  • FDR: 0.0
  • genes: 146
  • top gene: FBN1

Immunoreactive subtype

  • top gene set: HALLMARK_INTERFERON_ALPHA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_ALPHA_RESPONSE
  • pvalue: 0.0
  • ES: -0.8557666
  • NES: -2.903968
  • FDR: 0.0
  • genes: 79
  • top gene: PROCR