Custom Analysis Part 6 - veeninglab/BactMAP GitHub Wiki

Hierarchical clustering with TSclust

Now I will start the hierarchical clustering. I wrote a small function to run this automatically:

cluster_summary <- function(timedataframe, clustertype="COR"){
  #create the dissimilarity matrix with TSclust; a numeric matrix is read row-wise (one series per row)
  diss_matrix <- TSclust::diss(as.matrix(timedataframe), clustertype)
  #a bit of shuffling (from dist object to matrix to long data frame) to be able to plot a raster plot
  forplot <- as.data.frame(as.matrix(diss_matrix))
  forplot$x <- rownames(forplot)
  forplot <- tidyr::gather(forplot, y, value, -x)
  forplot <- forplot[order(forplot$value),]
  #a raster plot comparing the different cell profiles
  matrixplot <- ggplot2::ggplot(forplot, ggplot2::aes(x=x,y=y,fill=value)) +
    ggplot2::geom_raster() +
    ggplot2::scale_fill_viridis_c() +
    ggplot2::theme_minimal()
  #and hierarchical clustering of the dissimilarity matrix
  hierclust <- stats::hclust(diss_matrix)
  return(list("dissMatrix" = diss_matrix, "rasterPlot" = matrixplot, "hclust_result" = hierclust))
}

This function uses TSclust to create a dissimilarity matrix, which can then be used for hierarchical clustering. There are different methods for creating the matrix; check TSclust's extensive documentation for more information. I start with the correlation method ("COR").
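
To make the idea of a correlation-based dissimilarity concrete, here is a minimal base-R sketch on made-up data. It assumes the commonly documented form d = sqrt(2 * (1 - cor(x, y))), which is my understanding of what TSclust's "COR" method computes by default; check the TSclust documentation before relying on that. The names `toy`, `cor_dissimilarity`, `toy_diss` and `toy_hc` are hypothetical and not part of BactMAP or TSclust.

```r
# Toy data: 6 series (rows) of 20 time points, from two shape families.
set.seed(1)
tp <- seq(0, 1, length.out = 20)
toy <- rbind(
  t(replicate(3, sin(2 * pi * tp) + rnorm(20, sd = 0.1))),
  t(replicate(3, cos(2 * pi * tp) + rnorm(20, sd = 0.1)))
)
rownames(toy) <- paste0("series", 1:6)

# Assumed formula: d = sqrt(2 * (1 - Pearson correlation)).
# cor() works column-wise, so the matrix is transposed first.
# pmax() guards against tiny negative values from rounding.
cor_dissimilarity <- function(m) {
  d <- sqrt(2 * pmax(1 - stats::cor(t(m)), 0))
  stats::as.dist(d)
}

toy_diss <- cor_dissimilarity(toy)   # a "dist" object, like TSclust::diss returns
toy_hc <- stats::hclust(toy_diss)    # complete linkage by default
toy_groups <- stats::cutree(toy_hc, k = 2)
```

Series with a similar shape get a small dissimilarity regardless of their absolute level, which is why a correlation-based measure suits profile comparisons like the ones here.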

valueclust <- cluster_summary(dataframe_bins, "COR")

plot(valueclust$hclust_result)

The dendrogram shows two clearly separated main groups; below that, a few groups split off in quick succession. Because the first split on the right-hand side sits quite “high up”, I decided to start by clustering into 3 groups. For this I made another small function:

addGroups <- function(oriData, clusterData, breaks, DN){
  C1 <- cutree(clusterData, breaks) #cut the clustered data
  C1 <- as.data.frame(C1)
  C1$celldiv <- rownames(C1)
  colnames(C1) <- c(DN, "celldiv") #in my data the key column is always called "celldiv"; change this name when using the function on other data
  oriData <- merge(oriData, C1) #add cluster group name to original data
  return(oriData)
}
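
The `cutree` call inside `addGroups` can also be used on its own to compare several cut levels before settling on a number of groups. The sketch below uses made-up data and plain Euclidean distance (an assumption made to keep the example self-contained; the analysis above uses the TSclust dissimilarity instead). All object names are hypothetical.

```r
# Toy data: 15 profiles (rows) in three well-separated families.
set.seed(42)
toy <- rbind(
  matrix(rnorm(5 * 10, mean = 0), nrow = 5),
  matrix(rnorm(5 * 10, mean = 5), nrow = 5),
  matrix(rnorm(5 * 10, mean = 10), nrow = 5)
)
rownames(toy) <- paste0("celldiv", 1:15)

toy_hc <- stats::hclust(stats::dist(toy))

# Passing a vector of k values returns one membership column per k,
# with the columns named after the k values.
membership <- stats::cutree(toy_hc, k = 2:4)

# Group sizes when cutting into 3 groups.
sizes_k3 <- table(membership[, "3"])
```

Very small groups appearing at a given `k` are a hint that the cut is splitting off outliers rather than a meaningful cluster. After `plot(toy_hc)`, `rect.hclust(toy_hc, k = 3)` draws the chosen cut on the dendrogram.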

Now I can add group names to my datasets cells_mean and cells_image:

cells_mean <- addGroups(cells_mean, valueclust$hclust_result, 3, "COR")
cells_image_grouped <- merge(cells_image$rawdata_turned, cells_mean[,c("cell", "frame", "division", "COR")])
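
To see what the grouping step does to a table, here is a small sketch on made-up data. It follows the same steps as `addGroups` (cut the tree, turn the result into a data frame keyed by `celldiv`, merge), but the data, the Euclidean distance and the object names are assumptions for illustration only.

```r
# Four toy profiles; a and b rise, c and d fall.
toy_profiles <- rbind(
  a = c(1.0, 2.0, 3.0, 4.0),
  b = c(1.1, 2.1, 2.9, 4.2),
  c = c(4.0, 3.0, 2.0, 1.0),
  d = c(4.2, 3.1, 2.2, 0.9)
)
toy_hc <- stats::hclust(stats::dist(toy_profiles))

# cutree returns a named integer vector: names are the row names, values the group.
g <- stats::cutree(toy_hc, k = 2)
groups <- data.frame(celldiv = names(g), COR = unname(g), stringsAsFactors = FALSE)

# A per-frame table in which each celldiv can occur several times.
toy_cells <- data.frame(
  celldiv = c("a", "a", "b", "c", "d", "d"),
  frame   = c(1, 2, 1, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Every row of toy_cells receives the group of its celldiv.
merged <- merge(toy_cells, groups, by = "celldiv")
```

Note that `merge` keeps only rows whose key appears in both tables by default (`all = FALSE`), so rows without a cluster assignment are dropped silently; comparing `nrow()` before and after the merge is a cheap check.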

⬅️ Custom Analysis Part 5: Preparation for Clustering Custom Analysis Part 7: Visualizing the Groups ➡️