Visualizing co‐expressed gene modules in contrasting genotypes - labbces/sugarcane_RNAome GitHub Wiki

An interesting way to visualize the expression of coding and non-coding RNAs is by plotting the heatmap of co-expressed gene modules or clusters. I developed these two R scripts to plot the heatmap of all co-expressed modules and to plot a single heatmap showing the average expression of all modules.

The co-expression heatmaps of gene modules from the Hoang2017, Correr2020, and Perlo2022 datasets generated in this study were plotted, resulting in 134, 34, and 6590 modules, respectively. Due to the large number of modules, several approaches were tested to identify modules with interesting expression patterns, which are detailed below.

Identifying co-expressed genes in relevant modules

The correlation between the first principal component (PC1) of the modules and the experimental conditions of each dataset was assessed using this script. Experimental conditions were converted into categorical variables, and the Spearman correlation between these conditions and the PC1 of each module was calculated. Overall, correlations were evaluated for module expression with genotypes (independent of sample type), with the experimental conditions specific to each dataset (e.g., internode part, sampling time, sugar accumulation, and fiber accumulation), and with genotype-condition interactions (e.g., genotype * internode).

Only correlations greater than 0.6 (rho > 0.6) and statistically significant (p-value adjusted using the Bonferroni method) were selected for further investigation into the gene expression within these modules:

filtered_genotype <- mutate(genotype_table, adjusted = p.adjust(genotype_table$pvalue, method = "bonferroni")) %>% mutate(genotype_table, fdr = p.adjust(genotype_table$pvalue, method = "fdr")) %>% filter(adjusted  < 0.05 & abs(rho) > 0.6)

filtered_week <- mutate(week_table, adjusted = p.adjust(week_table$pvalue, method = "bonferroni")) %>% mutate(week_table, fdr = p.adjust(week_table$pvalue, method = "fdr")) %>% filter(adjusted  < 0.05 & abs(rho) > 0.6)

filtered_internode <- mutate(internode_table, adjusted = p.adjust(internode_table$pvalue, method = "bonferroni")) %>% mutate(internode_table, fdr = p.adjust(internode_table$pvalue, method = "fdr")) %>% filter(adjusted  < 0.05 & abs(rho) > 0.6)

filtered_interaction <- mutate(interaction_table, adjusted = p.adjust(interaction_table$pvalue, method = "bonferroni")) %>% mutate(interaction_table, fdr = p.adjust(interaction_table$pvalue, method = "fdr")) %>% filter(adjusted  < 0.05 & abs(rho) > 0.6)

Exploring interesting modules

The Perlo2022 dataset resulted in 2, 1, and 13 co-expressed modules associated with genotypes, internode part, and sampling time, respectively.

The Hoang2017 and Correr2020 datasets did not show modules significantly associated with the experimental conditions.

A likely explanation for this is that non-coding RNAs exhibit highly variable expression. Additionally, during matrix filtering by gene expression coefficient of variation, only genes with high expression levels were selected for co-expression analysis, which may have resulted in the loss of expression clustering with the experimental conditions. Nevertheless, it was still possible to identify modules with both coding and non-coding genes showing strong correlated expression, making them candidates for functional exploration.

Some of the interesting co-expressed modules from the Perlo dataset are presented below:

Module 1029 - Associated with internode sampling time