Journal Entry: GSEA - bcb420-2022/Emiliya_Stolyarova GitHub Wiki

Started: March 19, 2022. Completed: March 22, 2022.

I tried to run the Docker image that includes GSEA, but I received an error when doing so. When retrieving genesets from the Bader Lab geneset collection using RStudio, I made sure to specify the release date as March 01, 2021. For this analysis, I used the GSEA software for Windows (Mootha et al., 2003; Subramanian et al., 2005).

Explain the reasons for using each of the above parameters.

The file studied is a ranked list of genes for the mesenchymal and immunoreactive subtypes. The parameters are set to include GO biological process annotations but to exclude electronic annotations. The geneset size is set from 15 to 200. Setting the maximum to 200 excludes large, general genesets, so the resulting pathways are smaller and thus more relevant to the genes of interest. The minimum is set to 15 to exclude genesets that are too small to be biologically relevant. The gene set permutation parameter needs to be specified in order to analyze the ranked list, since a pre-ranked list carries no phenotype labels to permute.
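To make the size limits concrete, here is a minimal Python sketch of a 15 to 200 size filter. It is not how GSEA implements the filter internally; the function name `filter_gene_sets` and the toy gene and set names are made up for illustration, and it assumes that a geneset's size is counted only over genes that are present in the ranked list.

```python
def filter_gene_sets(gene_sets, ranked_genes, min_size=15, max_size=200):
    """Keep gene sets with min_size..max_size genes present in the ranked list."""
    universe = set(ranked_genes)
    kept = {}
    for name, genes in gene_sets.items():
        # Assumption: only genes that appear in the ranked list count toward the size.
        overlap = set(genes) & universe
        if min_size <= len(overlap) <= max_size:
            kept[name] = sorted(overlap)
    return kept


# Toy data (hypothetical gene and set names, not from this analysis).
ranked_genes = [f"GENE{i}" for i in range(300)]
gene_sets = {
    "too_small": ranked_genes[:5],     # 5 genes, below the minimum of 15
    "in_range": ranked_genes[:50],     # 50 genes, kept
    "too_large": ranked_genes[:250],   # 250 genes, above the maximum of 200
}

kept = filter_gene_sets(gene_sets, ranked_genes)
print(list(kept))  # ['in_range']
```

Both limits are inclusive in this sketch, so a geneset with exactly 15 or exactly 200 overlapping genes would be kept.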

Top gene set returned for the Mesenchymal subtype

The top geneset returned for the Mesenchymal subtype can be found in the enrichment results of the genes with positive scores. The name of the top geneset is "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION%MSIGDB_C2%HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION".
pvalue: 0.0
ES: 0.8635254
NES: 2.5625956
FDR: 0.0
57% of the Mesenchymal subtype genes from the gene list are in its leading edge.
The top gene of this geneset is FBN1.
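The geneset name above packs three fields into one string separated by `%`. The short Python sketch below splits such a name apart, assuming the layout is name, then source, then identifier; the helper `parse_gene_set_name` and the field labels are my own illustration and are not part of GSEA or the geneset file itself.

```python
def parse_gene_set_name(full_name):
    """Split a 'NAME%SOURCE%ID' gene set name into its three fields."""
    parts = full_name.split("%")
    if len(parts) != 3:
        # Assumption: names in this collection always have exactly three fields.
        raise ValueError(f"expected 3 '%'-separated fields, got {len(parts)}")
    name, source, identifier = parts
    return {"name": name, "source": source, "id": identifier}


top_mesenchymal = (
    "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION%MSIGDB_C2%"
    "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION"
)
fields = parse_gene_set_name(top_mesenchymal)
print(fields["name"])    # HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION
print(fields["source"])  # MSIGDB_C2
```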

Top gene set returned for the Immunoreactive subtype

The top geneset returned for the Immunoreactive subtype can be found in the enrichment results of the genes with negative scores. The name of the top geneset is "HALLMARK_INTERFERON_ALPHA_RESPONSE%MSIGDB_C2%HALLMARK_INTERFERON_ALPHA_RESPONSE".
pvalue: 0.0
ES: -0.85694104
NES: -2.9393806
FDR: 0.0
73% of the Immunoreactive subtype genes from the gene list are in its leading edge.
The top gene of this geneset is PROCR.
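To illustrate what the ES sign and the leading edge mean for the two results above, here is a toy Python sketch of the weighted running-sum statistic described by Subramanian et al. (2005). The gene names and scores are invented, the function `enrichment_score` is my own simplified version rather than the GSEA software's code, and it does not compute the NES, p-value or FDR, which require permutations.

```python
def enrichment_score(ranked, gene_set, p=1.0):
    """Return (ES, leading_edge) for `ranked`, a list of (gene, score)
    pairs sorted by descending score."""
    hits = [gene in gene_set for gene, _ in ranked]
    n_miss = len(ranked) - sum(hits)
    # Total weight of the genes in the set, used to normalise each hit step.
    norm = sum(abs(score) ** p for (_, score), hit in zip(ranked, hits) if hit)
    if n_miss == 0 or norm == 0:
        raise ValueError("need at least one miss and one hit with a non-zero score")

    running, es, peak = 0.0, 0.0, -1
    for i, ((_, score), hit) in enumerate(zip(ranked, hits)):
        # Step up by the weighted score at a hit, step down uniformly at a miss.
        running += abs(score) ** p / norm if hit else -1.0 / n_miss
        if abs(running) > abs(es):
            es, peak = running, i

    # Leading edge: set members before the peak (positive ES)
    # or after the peak (negative ES).
    if es >= 0:
        window = zip(ranked[: peak + 1], hits[: peak + 1])
    else:
        window = zip(ranked[peak:], hits[peak:])
    leading_edge = [gene for (gene, _), hit in window if hit]
    return es, leading_edge


# Toy ranked list (hypothetical genes and scores, not from this analysis).
ranked = [("A", 5.0), ("B", 4.0), ("C", 3.0), ("D", 2.0),
          ("E", -1.0), ("F", -2.0), ("G", -3.0), ("H", -4.0)]

es_top, edge_top = enrichment_score(ranked, {"A", "B", "F"})
es_bottom, edge_bottom = enrichment_score(ranked, {"G", "H"})
print(round(es_top, 3), edge_top)        # 0.818 ['A', 'B']
print(round(es_bottom, 3), edge_bottom)  # -1.0 ['G', 'H']
```

In the toy data, the set concentrated at the top of the list gets a positive ES (like the Mesenchymal result) and the set concentrated at the bottom gets a negative ES (like the Immunoreactive result). In the first case 2 of the 3 set members fall in the leading edge.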

References

Mootha, V. K., Lindgren, C. M., Eriksson, K. F., Subramanian, A., Sihag, S., Lehar, J., Puigserver, P., Carlsson, E., Ridderstråle, M., Laurila, E., Houstis, N., Daly, M. J., Patterson, N., Mesirov, J. P., Golub, T. R., Tamayo, P., Spiegelman, B., Lander, E. S., Hirschhorn, J. N., Altshuler, D., … Groop, L. C. (2003). PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nature Genetics, 34(3), 267–273. https://doi.org/10.1038/ng1180

Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L., Gillette, M. A., Paulovich, A., Pomeroy, S. L., Golub, T. R., Lander, E. S., & Mesirov, J. P. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America, 102(43), 15545–15550. https://doi.org/10.1073/pnas.0506580102