Assignment #2 - bcb420-2023/Helena_Jovic GitHub Wiki

Objective

  • For Assignment 2, the goal is to identify the pathways that are linked with genes that are significantly upregulated or downregulated in people living with HIV using data in the study "Loss of skin and mucosal CXCR3+ resident memory T cells causes irreversible tissue-confined immunodeficiency in HIV".
  • Perform differential gene expression analysis comparing different tissue samples.
  • Perform ORA on upregulated genes and downregulated genes.

Introduction

In the previous assignment, I sourced, cleaned and normalized dataset GSE184320. The data include 16 CD45+ cell samples sorted by tissue type: skin and peripheral blood mononuclear cells (PBMC). After cleaning, mapping and normalization, 20% of the original data set remained, for a total of 11895 genes.

The experiment describes the impact of antiretroviral therapy (ART) on skin tissue-resident memory T (Trm) cells in people living with HIV (PLWH). The authors found that late ART initiation leads to permanent depletion of skin CD45+ Trm cells, while early ART can reconstitute the pool of Trm cells lost in early HIV infection. They also found that PLWH receiving late ART treatment had a loss of CXCR3+ Trm cells and a tolerogenic skin immune environment. Additionally, HPV-induced precancerous lesion biopsies showed reduced CXCR3+ Trm cell frequencies in the mucosa in PLWH compared to HIV-negative individuals. These findings suggest that the irreversible loss of CXCR3+ Trm cells in skin and mucosa of PLWH who received late ART treatment may be a contributing factor in the development of HPV-related cancer.

Reference: Saluzzo S, Pandey RV, Gail LM, Dingelmaier-Hovorka R et al. Delayed antiretroviral therapy in HIV-infected individuals leads to irreversible depletion of skin- and mucosa-resident memory T cells. Immunity 2021 Dec 14;54(12):2842-2858.e5. PMID: 34813775

Time Management

Date Started: March 12, 2023
Date Completed: March 14, 2023
Estimated Time: 10 hours
Actual Time: 17 hours

Workflow

  1. Loaded the normalized data set from the previous assignment using read.csv("normalized_data.csv").
  2. Identified which factor to consider in the model. Based on the MDS plot generated in Assignment 1, samples cluster by the tissue type of the CD45+ cells (SKIN vs. PBMC) more than by any other factor. This can be visualized with the plotMDS function from the limma package.
  3. Created heatmap (without any p-value cutoffs) using pheatmap. Included appropriate figure headings and legends.
  4. Calculated p-values for differential expression using a linear model based on tissue type.
  5. Applied multiple hypothesis correction using the Benjamini-Hochberg method to adjust p-values.
  6. Set p-value threshold to 0.01.
  7. Created a heatmap, taking into account the threshold for p-value.
  8. Generated a list of genes that are upregulated/downregulated, that passed the threshold of 0.01.
  9. Used g:Profiler to perform ORA on upregulated genes and downregulated genes. Used the gost and publish_gostplot functions and added annotations to highlight the top terms.
  10. Re-read the paper associated with the dataset and interpreted the results of the analysis.
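Steps 6 to 8 of the workflow can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the assignment; the gene names, log fold changes and adjusted p-values are hypothetical, and the function name split_by_direction is an assumption made for this example.

```python
from typing import List, NamedTuple, Tuple


class GeneResult(NamedTuple):
    gene: str      # gene symbol (hypothetical names below)
    log_fc: float  # log fold change from the linear model
    adj_p: float   # p-value after multiple hypothesis correction


def split_by_direction(
    results: List[GeneResult], threshold: float = 0.01
) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) genes passing the threshold."""
    up = [r.gene for r in results if r.adj_p < threshold and r.log_fc > 0]
    down = [r.gene for r in results if r.adj_p < threshold and r.log_fc < 0]
    return up, down


# Hypothetical values, for illustration only.
example = [
    GeneResult("GENE_A", 2.1, 0.0004),
    GeneResult("GENE_B", -1.7, 0.0080),
    GeneResult("GENE_C", 0.3, 0.2500),
    GeneResult("GENE_D", -3.2, 0.0001),
]

up_genes, down_genes = split_by_direction(example)
print("up:", up_genes)
print("down:", down_genes)
```

The two lists are kept separate because each is submitted to ORA on its own in step 9.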

Issues and Resolutions

  • Ran into an error when creating the heatmap: the temporary file being created could not be opened. Tried several different ways of creating the heatmap, and all ran into the same error. Resolution: run the code in the Docker container instead and check whether the same error still occurs.

Results

  • With a p-value cutoff of 0.01, 2632 genes were found to be differentially expressed. After correction with the BH method at the same cutoff, 1795 genes remained.
  • For upregulated genes, the domain size was 445. Upregulated genes were enriched for terms such as "epidermis development", "skin development" and "extracellular region".
  • For downregulated genes, the domain size was 1465. Downregulated genes were enriched for terms such as "adaptive immune response", "regulation of immune system process" and "leukocyte activation".

Questions and Answers

Differential Gene Expression Analysis

  1. Calculate p-values for each of the genes in your expression set. How many genes were significantly differentially expressed? What thresholds did you use and why?

With a cutoff of 0.05, 2859 genes were significant after correction. I considered this excessive, so I made the cutoff stricter. I could have lowered the cutoff further to 0.001; however, I was concerned that I would discard interesting genes, so I kept the cutoff at 0.01, a generally accepted threshold for high significance.

  2. Multiple hypothesis testing - correct your p-values using a multiple hypothesis correction method. Which method did you use? And Why? How many genes passed correction?

I corrected for multiple hypothesis testing using the Benjamini-Hochberg method. I chose this method because it controls the false discovery rate, limiting false positives, which is critical in clinical samples, without being overly conservative. Since the sample size wasn’t huge, I didn’t go with Bonferroni or other methods that might be too strict and exclude significant discoveries. There were 2672 genes that passed correction.
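To illustrate why Bonferroni is stricter than Benjamini-Hochberg, here is a minimal sketch of both adjustments in plain Python. It is not the R code used in the assignment (where p.adjust or limma would normally do this), and the raw p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def bonferroni(p_values: List[float]) -> List[float]:
    """Bonferroni adjusted p-values: multiply by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values, for illustration only.
raw = [0.0001, 0.0010, 0.0040, 0.0300, 0.2000]
bh = benjamini_hochberg(raw)
bonf = bonferroni(raw)

threshold = 0.01
n_bh = sum(p < threshold for p in bh)
n_bonf = sum(p < threshold for p in bonf)
print("passing BH:", n_bh, "passing Bonferroni:", n_bonf)
```

Each Benjamini-Hochberg adjusted value is never larger than the corresponding Bonferroni value, so at a fixed threshold Benjamini-Hochberg retains at least as many genes.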

ORA

  1. Which method did you choose and why?

I chose g:Profiler because it is compatible with the HGNC symbols in the dataset and its annotation data are kept up to date. Additionally, g:Profiler offers user-friendly visualization tools through both its web server and its R library, making it easy to navigate and analyze the results.
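Over-representation analysis of this kind rests on a hypergeometric tail probability: given a background of N genes, of which K are annotated to a term, how likely is it that a query of n genes contains at least k annotated genes by chance? The sketch below computes that probability in plain Python. The counts are hypothetical, the function name is an assumption, and the multiple-testing correction that g:Profiler applies across terms is not reproduced here.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts, for illustration only: a background of 20 genes,
# 5 annotated to the term, and a query of 4 genes containing 3 of them.
p_value = hypergeom_upper_tail(k=3, n=4, K=5, N=20)
print(round(p_value, 4))
```

A small probability means the term appears in the query list more often than random sampling from the background would explain.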

  2. How many genesets were returned with what thresholds?
  • For upregulated genes, the domain size was 445.
  • For downregulated genes, the domain size was 1465.
  3. Run the analysis using the up-regulated set of genes, and the down-regulated set of genes separately. How do these results compare to using the whole list (i.e all differentially expressed genes together vs. the up-regulated and down regulated differentially expressed genes separately)?
  • Up-regulated terms: The terms "extracellular region", "extracellular space", "extracellular vesicle", and "extracellular organelle" are all related to cellular components and structures that are located outside of the cell membrane, in the extracellular space. The terms "epidermis development" and "skin development" are related to the biological processes that are involved in the development and maintenance of the epidermis and skin tissues, respectively. Overall, these terms are all related to extracellular structures and biological processes that play important roles in various aspects of organismal development, homeostasis, and function.
  • Down-regulated terms: The terms "immune system process", "immune response", "immune system", "adaptive immune response", "regulation of immune system process", and "leukocyte activation" all pertain to the immune system, which is responsible for protecting the body against pathogens like viruses and bacteria. The "immune system" is a complex network of cells, tissues, and organs that work together to identify and eliminate foreign invaders while maintaining tolerance to self. "Immune system process" refers to any process involved in the immune system's function, such as antigen presentation, lymphocyte activation, and cytokine signaling. "Immune response" specifically refers to the immune system's recognition and response to a pathogen or foreign substance. "Adaptive immune response" is triggered by exposure to a particular pathogen or antigen, resulting in the production of antigen-specific antibodies and immune memory. "Regulation of immune system process" involves various mechanisms that control and modulate the immune system, including negative feedback loops and regulatory T cells. "Leukocyte activation" is the process by which white blood cells are activated to respond to a pathogen or foreign substance, which involves a complex interplay of signaling pathways and cellular interactions.
  • Overall, the terms enriched for the whole list differ very little from the down-regulated terms: in both cases, every term relates to the immune system.

Interpretation

  1. Do the over-representation results support conclusions or mechanism discussed in the original paper?

The g:Profiler results are consistent with the dysregulation of immune processes in the skin of PLWH (HIVSeq cohort) discussed in the study. The downregulation of CXCR3 expression and Th1-like T cell-related genes, which are important for anti-viral immunity, suggests a shift towards an anti-inflammatory and tolerogenic environment that may promote cancer. Additionally, the overexpression of genes involved in the regulation of immune effector processes, natural killer cell-mediated immunity, and IL-12 production suggests a tolerogenic or suppressive T helper cell phenotype, further supporting the notion of immune dysregulation in the skin of PLWH.

Regarding the terms enriched among upregulated genes ("epidermis development," "tissue development," "skin development," "extracellular region," "extracellular space," and "extracellular exosome"), the study does not directly discuss their role in the dysregulation of the immune system in PLWH skin. However, the underlying genes may be related to the development and maintenance of the skin microenvironment that supports Trm cell function. The study highlights the importance of Trm cells in promoting tissue and immune homeostasis in the skin, protecting against microbes and cancer, and playing a role in communication with other cells in the skin microenvironment.

Overall, the g:Profiler results provide further evidence of immune dysregulation in the skin of PLWH, with a shift towards a tolerogenic and suppressive immune phenotype. While the study does not directly discuss the upregulated genes in relation to this dysregulation, they may play a role in the development and maintenance of the skin microenvironment that supports Trm cell function, which is crucial for tissue and immune homeostasis in the skin.

  2. Can you find evidence, i.e. publications, to support some of the results that you see? How does this evidence support your results?

In summary, the gene expression patterns underlying the g:Profiler results suggest a dysregulation of immune processes in the skin of PLWH, with a shift towards a tolerogenic and suppressive immune phenotype. Downregulation of CXCR3 expression and Th1-like T cell-related genes indicates a decreased capacity for anti-viral immunity, while upregulated genes related to skin and tissue development may contribute to the maintenance of the skin microenvironment that supports Trm cell function. Overall, these findings provide insights into the immune dysregulation in the skin of PLWH and suggest potential targets for therapeutic intervention.

References

  • Isserlin, Ruth. (2023). Week 6 Differential Gene Expression Analysis. University of Toronto.

  • Isserlin, Ruth. (2023). Week 8 Annotation Resources and Simple Enrichment. University of Toronto.

  • Geistlinger L, Csaba G, Santarelli M, Ramos M, Schiffer L, Turaga N, Law C, Davis S, Carey V, Morgan M, Zimmer R, Waldron L. Toward a gold standard for benchmarking gene set enrichment analysis. Brief Bioinform. 2020 Feb 6.

  • Saluzzo S, Pandey RV, Gail LM, Dingelmaier-Hovorka R et al. Delayed antiretroviral therapy in HIV-infected individuals leads to irreversible depletion of skin- and mucosa-resident memory T cells. Immunity 2021 Dec 14;54(12):2842-2858.e5. PMID: 34813775

  • Chimbetete, T., Buck, C., Choshi, P., Selim, R., Pedretti, S., Divito, S. J., … Peter, J. (2023). HIV-Associated Immune Dysregulation in the Skin: A Crucible for Exaggerated Inflammation and Hypersensitivity. Journal of Investigative Dermatology, 143(3), 362–373. doi:10.1016/j.jid.2022.07.035