Create Clustergram - hisl6802/ClusteringToolbox GitHub Wiki

Creating a clustergram

Clustering can reveal which metabolites (or, more generally, data objects) consistently co-cluster or are strongly associated. This functionality produces publication-quality visuals for the selected agglomerative hierarchical clustering routine.

Input Excel sheet

| mz | S1 | S2 | S3 | S4 | S5 | S6 | S7 | rtmed |
|---|---|---|---|---|---|---|---|---|
|  | G1 | G1 | G1 | G1 | G2 | G2 | G2 |  |
| 60.04 | $I_{11}$ | $I_{12}$ | $I_{13}$ | $I_{14}$ | $I_{15}$ | $I_{16}$ | $I_{17}$ | 3.21 |
| 61.04 | $I_{21}$ | $I_{22}$ | $I_{23}$ | $I_{24}$ | $I_{25}$ | $I_{26}$ | $I_{27}$ | 3.62 |
| 62.04 | $I_{31}$ | $I_{32}$ | $I_{33}$ | $I_{34}$ | $I_{35}$ | $I_{36}$ | $I_{37}$ | 3.62 |
| 69.99 | $I_{41}$ | $I_{42}$ | $I_{43}$ | $I_{44}$ | $I_{45}$ | $I_{46}$ | $I_{47}$ | 9.33 |
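The layout above (a header row, a group-label row, then one row per feature) could be split into its parts roughly as follows. This is only a sketch, not the toolbox's own loader: the intensity values are made-up stand-ins for the $I_{ij}$ entries, the variable names are hypothetical, and it assumes the sheet is read without a header row.

```python
import pandas as pd

# Hypothetical in-memory copy of the sheet layout; with a real file one could
# try pd.read_excel("data.xlsx", header=None) instead (file name is assumed).
raw = pd.DataFrame([
    ["mz", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "rtmed"],
    [None, "G1", "G1", "G1", "G1", "G2", "G2", "G2", None],
    [60.04, 10.0, 12.0, 11.0, 13.0, 30.0, 28.0, 31.0, 3.21],
    [61.04, 5.0, 6.0, 5.5, 6.5, 2.0, 2.5, 1.5, 3.62],
    [62.04, 7.0, 7.5, 8.0, 7.2, 7.1, 7.9, 7.4, 3.62],
    [69.99, 1.0, 1.2, 0.9, 1.1, 4.0, 4.2, 3.9, 9.33],
])

header = raw.iloc[0].tolist()
sample_cols = header[1:-1]                            # S1..S7

# Second row: map each sample column to its group label.
groups = dict(zip(sample_cols, raw.iloc[1, 1:-1]))

# Remaining rows: one feature (mz) per row.
data = raw.iloc[2:].copy()
data.columns = header
data = data.set_index("mz")

intensities = data[sample_cols].astype(float)         # features x samples
rtmed = data["rtmed"].astype(float)                   # retention time per feature

print(groups)
print(intensities.shape)
```

Keeping the group labels separate from the numeric block means the intensity matrix can be passed directly to a clustering routine.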

Available linkage functions

  • Ward
  • Complete
  • Average
  • Single
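For orientation, the four linkage names above match method names accepted by SciPy's `scipy.cluster.hierarchy.linkage`. The sketch below assumes the toolbox relies on that function (the page only confirms that SciPy's `pdist` is used) and runs each method on random placeholder data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Placeholder data: 6 hypothetical features measured in 7 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 7))

# One linkage matrix per method; each has shape (n_observations - 1, 4).
methods = ("ward", "complete", "average", "single")
trees = {m: linkage(X, method=m, metric="euclidean") for m in methods}

for name, Z in trees.items():
    print(name, Z.shape)
```

Note that SciPy documents Ward linkage as correctly defined only for Euclidean distances, which is worth keeping in mind when pairing it with other distance measures.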

Available distance measures

  • All metrics supported by pdist from SciPy (scipy.spatial.distance.pdist; see the SciPy docs)

Correlation No-square root

$$ D(Y,Y') = 1 - \lvert r \rvert $$

where $r$ is a correlation measure (e.g., the Pearson correlation) between $Y$ and $Y'$.
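As an illustration, this distance can be computed from a Pearson correlation matrix and handed to SciPy's hierarchical clustering. This is a minimal sketch on placeholder data with hypothetical variable names, not necessarily how the toolbox computes it.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 7))        # 5 hypothetical features x 7 samples
X[1] = -2.0 * X[0] + 0.5           # row 1 is perfectly anti-correlated with row 0

r = np.corrcoef(X)                 # Pearson correlation between rows
D = 1.0 - np.abs(r)                # D(Y, Y') = 1 - |r|
D = np.clip(D, 0.0, None)          # guard against tiny negative rounding errors
np.fill_diagonal(D, 0.0)

# linkage expects the condensed (upper-triangle) form of the distance matrix.
condensed = squareform(D, checks=False)
Z = linkage(condensed, method="average")

print(D[0, 1])                     # close to 0: anti-correlated rows count as near
```

Because of the absolute value, strongly anti-correlated features are treated as close together, just like strongly correlated ones.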

Correlation square root

$$ D(Y,Y') = \sqrt{2(1 - \lvert r \rvert)} $$
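One way to read the square-root form: when $r$ is the Pearson correlation and each vector is mean-centered and scaled to unit length, $\sqrt{2(1 - \lvert r \rvert)}$ equals the smaller of the Euclidean distances between one vector and the other or its negation. The sketch below checks this numerically on random placeholder vectors; `unit_center` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=7)
y = rng.normal(size=7)

def unit_center(v):
    """Subtract the mean and scale to unit Euclidean norm."""
    v = v - v.mean()
    return v / np.linalg.norm(v)

u, w = unit_center(x), unit_center(y)

r = float(u @ w)                                  # equals the Pearson correlation
d_sqrt = np.sqrt(2.0 * (1.0 - abs(r)))            # D(Y, Y') = sqrt(2(1 - |r|))

# Euclidean distance to w or to -w, whichever is smaller.
d_euclid = min(np.linalg.norm(u - w), np.linalg.norm(u + w))

print(np.isclose(d_sqrt, d_euclid))
```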