Enrichment Analysis - GianlucaMattei/methyl.O GitHub Wiki

Enrichment of Methylated Regions:

Because the deregulation of a single gene's expression is unlikely to affect the final phenotype, the effect of the methylation profile should be evaluated as a whole, so as to capture how small changes can cooperate to shape the final result. For this reason methyl.O can perform enrichment analyses using 12 databases. Since the effect on gene expression depends on which regions are overlapped, the genes used to query the databases can be chosen based on the annotated features. It is therefore possible to query only the genes with DMRs overlapping their promoter, or their head, and so on, or a combination of the available features, or all the genes returned by the annotation. The results of this analysis identify the pathways to whose perturbation differential methylation contributes most. Moreover, the genes from the resulting pathways can be extracted to further evaluate the correlation between methylation and expression.

R:

The function annotatedDMRs2Enrichr is designed to easily perform enrichment analysis, based on the R package enrichR, using as input the annotated list object returned by annotateDMRs. The second argument specifies the features used to select the genes that query the databases. Since a statistic is computed for each enriched pathway, other options define the parameters for filtering the results: once the statistic to use is chosen (stat.filter) among the p.value, the adj. p.value or the overlap between the queried genes and the genes of the pathway, its threshold can be set (stat.thr). By default 12 different databases are used for the enrichment; alternatively, specific databases can be selected (db).

The function plotDMRs2Enrichr plots the results of annotatedDMRs2Enrichr. Two types of plot can be returned, the barplot and the lollipop plot, both displaying the -log10 of the p.value or adj. p.value of the enriched pathways; therefore, the smaller the p.value, the greater the value shown in the plot. The barplot also reports the fraction of hyper- and hypo-methylated genes among the genes that contributed to the enrichment of a specific pathway. This information can help to understand whether a pathway is repressed, upregulated or just altered in the studied conditions.

The function takes as inputs the enrichment results from annotatedDMRs2Enrichr and the annotation results from annotateDMRs. Other parameters set the statistic used to show and sort the results, chosen among the p.value, the adj. p.value or the overlap (stat), the number of pathways to display (n) and the plot type, barplot or lollipop (plot.type). The graphical parameters define the colors of the hyper- and hypo-methylated genes (col.hyper; col.hypo) and the color palette (among hcl.pals()) for the lollipop plot. Finally, it is possible to plot vertical lines indicating the statistical thresholds (thrs) and to set their colors (thrs.col).
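As a rough illustration of the filtering and sorting logic described above, the base R sketch below applies a threshold to one statistic and ranks the surviving pathways by the -log10 of that statistic. It does not call methyl.O. The toy table, its values and the helper filter_enrichment are hypothetical, and the column names only mimic those of an enrichR result table (Term, P.value, Adjusted.P.value).

```r
# Toy enrichment table; column names mimic enrichR output, values are made up.
enrich <- data.frame(
  Term             = c("Pathway A", "Pathway B", "Pathway C", "Pathway D"),
  P.value          = c(0.0004, 0.03, 0.20, 0.008),
  Adjusted.P.value = c(0.004, 0.09, 0.40, 0.04),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of methyl.O): keep the pathways whose chosen
# statistic is below the threshold, then sort from most to least significant.
filter_enrichment <- function(tab, stat = "Adjusted.P.value", thr = 0.05) {
  kept <- tab[tab[[stat]] < thr, , drop = FALSE]
  kept$score <- -log10(kept[[stat]])  # smaller p.value -> larger score
  kept[order(kept$score, decreasing = TRUE), , drop = FALSE]
}

res <- filter_enrichment(enrich, stat = "Adjusted.P.value", thr = 0.05)
print(res[, c("Term", "Adjusted.P.value", "score")])

# On the -log10 scale, a vertical threshold line for 0.05 would sit here:
thr_line <- -log10(0.05)
print(thr_line)
```

With these made-up values only the pathways with an adj. p.value below 0.05 are kept, and the most significant one comes first because its -log10 score is the largest.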

GUI:

The Methylation Enrichment tab performs the enrichment for the annotated genes resulting from the Annotate Methylated Region tab, returning both a table and a plot of the results. Two types of plot can be returned, the barplot and the lollipop plot, both displaying either the number of genes found in the pathway or the -log10 of the p.value or adj. p.value of the enriched pathways; therefore, the smaller the p.value, the greater the value shown in the plot. This value is also used to sort the results. The barplot also reports the fraction of hyper- and hypo-methylated genes among the genes that contributed to the enrichment of a specific pathway. This information can help to understand whether a pathway is repressed, upregulated or just altered in the studied conditions. Moreover, to study how methylation can alter the expression of an enriched pathway, the genes from each pathway can be extracted to study the correlation between methylation and expression. This can be done under the Methylation vs Expression tab: on the left panel, “Select Path for filtering Genes” shows a dropdown menu with the pathways resulting from the current analysis.

The parameters for the enrichment analysis can be found on the left panel:

| Command | Description |
| --- | --- |
| Statistics for Results Filtering | A p.value and an adj. p.value are computed for each enriched pathway; this parameter sets which statistic to use. |
| Select Annotation from where gene symbols are taken | Select the differentially methylated features from which the genes are picked. |
| DBs to Query | Select the databases to query among the 12 available. If left blank, all 12 databases will be used. |
| LogFC Threshold | The log fold change threshold for genes to be considered. |
| Value Threshold for Filtering by Statistics | Select the threshold for the selected statistic. |

Table 9: Parameters for Methylation Enrichment tab

Finally, since the enrichment is performed on the results of the annotation process, changing the parameters used for the annotation itself will affect the resulting enriched pathways.
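The fraction of hyper- and hypo-methylated genes reported by the barplot can be pictured with a small base R sketch. The gene symbols, methylation differences and pathway memberships below are invented for illustration. The sign convention (positive difference means hyper-methylated) is an assumption, not documented methyl.O behaviour.

```r
# Hypothetical annotated genes; a positive beta difference is assumed to
# mean hyper-methylation, a negative one hypo-methylation.
genes <- data.frame(
  symbol    = c("G1", "G2", "G3", "G4", "G5"),
  beta.diff = c(0.35, -0.20, 0.15, -0.40, 0.25),
  stringsAsFactors = FALSE
)

# Hypothetical pathway membership, written as ';'-separated gene symbols.
pathways <- c("Pathway A" = "G1;G2;G3", "Pathway D" = "G4;G5")

# For each pathway, compute the fraction of hyper- and hypo-methylated genes
# among the genes that contributed to its enrichment.
fractions <- t(sapply(pathways, function(members) {
  in_path <- genes[genes$symbol %in% strsplit(members, ";")[[1]], ]
  c(hyper = mean(in_path$beta.diff > 0),
    hypo  = mean(in_path$beta.diff < 0))
}))
print(round(fractions, 2))
```

Each row of the resulting matrix corresponds to one pathway, and its two fractions sum to one as long as no gene has a difference of exactly zero.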