Custom Analysis Part 6 - veeninglab/BactMAP GitHub Wiki
Now I will start the hierarchical clustering. I wrote a small function to run this automatically:
cluster_summary <- function(timedataframe, clustertype="COR"){
#create dissimilarity matrix using TSclust.
dif_COR <- TSclust::diss(as.matrix(timedataframe), clustertype)
#a bit of shuffling (convert to a matrix and then to a data frame, to be able to plot a raster plot)
forplot <- as.data.frame(as.matrix(dif_COR))
forplot$x <- rownames(forplot)
forplot <- tidyr::gather(forplot, y, value, -x)
forplot <- forplot[order(forplot$value),]
#a rasterplot comparing the different cell profiles
matrixplot <- ggplot2::ggplot(forplot, ggplot2::aes(x=x,y=y,fill=value)) +
  ggplot2::geom_raster() +
  ggplot2::scale_fill_viridis_c() +
  ggplot2::theme_minimal()
#and hierarchical clustering of the dissimilarity matrices
hierclust <- hclust(dif_COR)
return(list("dissMatrix" = dif_COR, "rasterPlot" = matrixplot, "hclust_result" = hierclust))
}
This function uses TSclust to create a dissimilarity matrix, which can
then be used for hierarchical clustering. There are different
methods for creating the matrix; check TSclust's extensive documentation
for more information. I start by using the correlation method.
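To make the dissimilarity step more concrete, here is a minimal base-R sketch on made-up profiles. It assumes that the "COR" method corresponds to the distance sqrt(2 * (1 - rho)), with rho the Pearson correlation between two profiles; check the TSclust documentation of `diss.COR` before relying on this. The object names (`profiles`, `d_cor`, `toy_groups`) and all values are hypothetical and not part of the BactMAP data.

```r
# Toy data: 4 short profiles stored as rows (hypothetical values)
set.seed(1)
ramp <- seq(0, 1, length.out = 10)
profiles <- rbind(
  up1   = ramp + rnorm(10, sd = 0.05),
  up2   = ramp + rnorm(10, sd = 0.05),
  down1 = rev(ramp) + rnorm(10, sd = 0.05),
  down2 = rev(ramp) + rnorm(10, sd = 0.05)
)

# Pearson correlation between rows (cor() compares columns, hence t())
rho <- cor(t(profiles))

# Assumed correlation-based dissimilarity: close to 0 for strongly correlated
# profiles, close to 2 for strongly anti-correlated ones.
# pmax() guards against tiny negative values caused by rounding.
d_cor <- as.dist(sqrt(2 * pmax(0, 1 - rho)))

# Same clustering step as in cluster_summary()
hc <- hclust(d_cor)

# Cut the tree into two groups; the result is a named integer vector
toy_groups <- cutree(hc, k = 2)

# plot(hc)  # uncomment to draw the dendrogram
```

In this toy case the two rising profiles should end up in one group and the two falling profiles in the other, which is the kind of structure the dendrogram is used to spot.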
valueclust <- cluster_summary(dataframe_bins, "COR")
plot(valueclust$hclust_result)
You can see that there are two clearly separated groups; below that, a few groups split off very quickly. Because there is quite a “high-up” first split in the groups on the right side, I decide to start by clustering into 3 groups. For this I made another small function:
addGroups <- function(oriData, clusterData, breaks, DN){
C1 <- cutree(clusterData, breaks) #cut the clustered data
C1 <- as.data.frame(C1)
C1$celldiv <- rownames(C1)
colnames(C1) <- c(DN, "celldiv") #in my case the ID column is always called "celldiv"; change this name when reusing the function on other data
oriData <- merge(oriData, C1) #add cluster group name to original data
return(oriData)
}
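The following sketch shows, on a small made-up example, what the two steps inside addGroups do: `cutree()` turns the clustering result into one group number per row name, and `merge()` attaches that number to a table sharing the `celldiv` column. The objects `toy`, `toy_mean` and the column `meanlength` are hypothetical, and Euclidean distance is used here only to keep the example free of extra packages.

```r
# Hypothetical toy profiles: one row per cell division, row names act as "celldiv" IDs
toy <- rbind(
  div1 = c(1, 2, 3, 4),
  div2 = c(1, 2, 3, 5),
  div3 = c(4, 3, 2, 1),
  div4 = c(5, 3, 2, 1),
  div5 = c(1, 3, 1, 3)
)

# Hierarchical clustering on Euclidean distances (for illustration only)
toy_hclust <- hclust(dist(toy))

# cutree() returns a named integer vector: names are row names, values are group numbers
groups <- cutree(toy_hclust, k = 3)

# Turn the vector into a lookup table with one row per celldiv
lookup <- data.frame(celldiv = names(groups), COR = unname(groups))

# Hypothetical per-division summary table, standing in for cells_mean
toy_mean <- data.frame(
  celldiv    = c("div1", "div2", "div3", "div4", "div5"),
  meanlength = c(2.1, 2.3, 3.0, 3.2, 2.6)
)

# merge() joins on the shared column name "celldiv"
toy_mean_grouped <- merge(toy_mean, lookup)
```

Because `merge()` joins on every column name the two tables share, it is worth checking that `celldiv` is the only overlap before applying this to real data.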
Now I can add group names to my datasets cells_mean and cells_image:
cells_mean <- addGroups(cells_mean, valueclust$hclust_result, 3, "COR")
cells_image_grouped <- merge(cells_image$rawdata_turned, cells_mean[,c("cell", "frame", "division", "COR")])
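Since `merge()` performs an inner join by default, rows without a matching cell/frame/division combination are silently dropped. A minimal sketch of how this could be checked, using made-up stand-ins (`toy_cells_mean` for cells_mean, `toy_raw` for cells_image$rawdata_turned; the column `values` is an assumption):

```r
# Hypothetical stand-in for cells_mean: one row per cell/frame/division, with a group label
toy_cells_mean <- data.frame(
  cell     = c(1, 2, 1),
  frame    = c(1, 1, 2),
  division = c(1, 1, 2),
  COR      = c(1, 2, 1)
)

# Hypothetical stand-in for the raw image data: several rows per cell,
# plus one cell (cell 3) that has no group label
toy_raw <- data.frame(
  cell     = c(1, 1, 2, 2, 1, 3),
  frame    = c(1, 1, 1, 1, 2, 2),
  division = c(1, 1, 1, 1, 2, 1),
  values   = c(10, 12, 7, 9, 11, 5)
)

# Default merge: inner join on all shared columns (cell, frame, division)
toy_grouped <- merge(toy_raw, toy_cells_mean[, c("cell", "frame", "division", "COR")])

# all.x = TRUE keeps unmatched raw rows and fills their group label with NA
toy_all <- merge(toy_raw, toy_cells_mean, all.x = TRUE)

# Number of raw rows that would be lost in the inner join
n_unmatched <- sum(is.na(toy_all$COR))

# Number of raw rows per cluster group
group_sizes <- table(toy_grouped$COR)
```

Comparing `nrow(toy_raw)` with `nrow(toy_grouped)`, or looking at `n_unmatched`, shows whether any cells were left without a group.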