Create Clustergram - hisl6802/ClusteringToolbox GitHub Wiki
Creating a clustergram
Clustering can provide critical insights into which metabolites, or more generally data objects, consistently co-cluster or are strongly associated. This functionality aims to produce publication-quality visuals for the chosen agglomerative hierarchical clustering routine.
Input Excel sheet
| mz | S1 | S2 | S3 | S4 | S5 | S6 | S7 | rtmed |
|---|---|---|---|---|---|---|---|---|
| | G1 | G1 | G1 | G1 | G2 | G2 | G2 | |
| 60.04 | $I_{11}$ | $I_{12}$ | $I_{13}$ | $I_{14}$ | $I_{15}$ | $I_{16}$ | $I_{17}$ | 3.21 |
| 61.04 | $I_{21}$ | $I_{22}$ | $I_{23}$ | $I_{24}$ | $I_{25}$ | $I_{26}$ | $I_{27}$ | 3.62 |
| 62.04 | $I_{31}$ | $I_{32}$ | $I_{33}$ | $I_{34}$ | $I_{35}$ | $I_{36}$ | $I_{37}$ | 3.62 |
| 69.99 | $I_{41}$ | $I_{42}$ | $I_{43}$ | $I_{44}$ | $I_{45}$ | $I_{46}$ | $I_{47}$ | 9.33 |
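
As a rough illustration of this layout, the sketch below builds the same structure in memory with pandas and splits it into group labels and an intensity matrix. The intensity values are made up, and the names `sheet`, `groups`, `features`, and `intensities` are hypothetical; this is not the toolbox's own loading code. A real sheet would presumably be read with `pandas.read_excel` instead of being typed in.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet. The first data row holds the
# group label of each sample; every later row is one feature.
# All intensity values here are invented for illustration.
columns = ["mz", "S1", "S2", "S3", "S4", "S5", "S6", "S7", "rtmed"]
sheet = pd.DataFrame(
    [
        [None, "G1", "G1", "G1", "G1", "G2", "G2", "G2", None],
        [60.04, 10.0, 12.0, 11.0, 13.0, 30.0, 28.0, 31.0, 3.21],
        [61.04, 5.0, 6.0, 5.5, 6.5, 2.0, 2.5, 1.5, 3.62],
        [62.04, 8.0, 7.0, 9.0, 8.5, 8.0, 7.5, 9.5, 3.62],
        [69.99, 1.0, 1.2, 0.9, 1.1, 4.0, 4.2, 3.9, 9.33],
    ],
    columns=columns,
)

# Sample columns are everything between the mz and rtmed columns.
sample_cols = [c for c in sheet.columns if c not in ("mz", "rtmed")]

# Row 0 maps each sample to its group (S1 -> G1, ..., S7 -> G2).
groups = sheet.loc[0, sample_cols]

# Remaining rows are features; keep only the numeric intensities.
features = sheet.iloc[1:].reset_index(drop=True)
intensities = features[sample_cols].astype(float)
```

Here `intensities` has one row per feature and one column per sample, which is the shape a clustering routine would typically consume.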
Available linkage functions
- Ward
- Complete
- Average
- Single
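
A minimal sketch of how these four options could be run, assuming they correspond to the `method` names accepted by `scipy.cluster.hierarchy.linkage` (the wiki does not state which library performs the linkage). The data are random and the variable names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Made-up data: 6 observations with 4 features each.
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 4))

# SciPy method names assumed to match the four options listed above.
methods = ["ward", "complete", "average", "single"]

# Each linkage matrix has n - 1 rows (one per merge) and 4 columns:
# the two merged cluster ids, the merge distance, and the new cluster size.
linkages = {m: linkage(data, method=m, metric="euclidean") for m in methods}
```

Note that in SciPy, Ward linkage is only correctly defined for Euclidean distances, so pairing it with another distance measure needs care.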
Available distance measures
- All metrics listed for `pdist` in SciPy (`scipy.spatial.distance.pdist`; see the SciPy docs)
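
For orientation, a small sketch of calling `pdist` with two of its built-in metrics. The three points are made up and chosen so the distances are easy to check by hand.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three made-up observations in two dimensions.
data = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# pdist returns the condensed form: distances for pairs (0,1), (0,2), (1,2).
euclid = pdist(data, metric="euclidean")  # 5, 10, 5
city = pdist(data, metric="cityblock")    # 7, 14, 7

# squareform expands the condensed vector into a symmetric n x n matrix.
euclid_square = squareform(euclid)
```

Any other metric name that `pdist` accepts can be substituted for `"euclidean"` or `"cityblock"` in the same way.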
Correlation No-square root
$$ D(Y,Y') = 1 - \lvert r \rvert $$
where $r$ is a correlation measure (e.g., Pearson correlation)
Correlation square root
$$ D(Y,Y') = \sqrt{2(1 - \lvert r \rvert)} $$
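
The two correlation-based distances above can be sketched with NumPy as follows, assuming Pearson correlation between the rows of a data matrix. The function name `correlation_distances` and its `square_root` flag are hypothetical and are not taken from the toolbox.

```python
import numpy as np

def correlation_distances(data, square_root=False):
    """Pairwise distances between rows of `data` based on |Pearson r|.

    square_root=False gives D = 1 - |r|
    square_root=True  gives D = sqrt(2 * (1 - |r|))
    """
    r = np.corrcoef(data)                      # rows are treated as variables
    d = np.clip(1.0 - np.abs(r), 0.0, None)    # guard against tiny negatives
    if square_root:
        d = np.sqrt(2.0 * d)
    np.fill_diagonal(d, 0.0)                   # a row is at distance 0 from itself
    return d

# Made-up rows: y1 and y2 are perfectly correlated, y3 is perfectly
# anti-correlated with y1, and y4 is uncorrelated with y1.
rows = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, -1.0, 1.0],
])

d_plain = correlation_distances(rows)
d_sqrt = correlation_distances(rows, square_root=True)
```

Because both forms use $\lvert r \rvert$, perfectly anti-correlated rows end up at distance 0, just like perfectly correlated ones. To feed such a matrix to SciPy's `linkage`, it could be converted to condensed form with `scipy.spatial.distance.squareform(d, checks=False)`.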