# STOCSY - EBI-Metabolights/SAFERnmr GitHub Wiki

When trying to computationally identify signatures in complex data, the first approach is typically Statistical TOtal Correlation SpectroscopY (STOCSY). The idea here is to simulate a TOCSY experiment:

TOCSY

- uses magnetization transfer to probe 1H peak interactions within a spin system
- more intense cross-peaks represent stronger connections between protons
- 2D experiment where each peak is probed for interaction with each other peak (axes are 1H, 1H)

[image of a TOCSY]

STOCSY

- uses inherent concentration variations between samples in a dataset to probe intramolecular 1H peak interactions
- peaks which are in the same molecule rise and fall together across samples (i.e. are highly correlated)
- covariance preserves peak shape and relative intensity better than correlation does; correlation tells you 'where' the related signals are, covariance tells you 'what' they look like
- can be 2D, like a TOCSY

[image of a 2D STOCSY, image of a selected row result]
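As a rough illustration of the 2D case, the sketch below computes the full point-by-point covariance and correlation matrices with NumPy. It is not the SAFERnmr implementation. The function names `stocsy_2d` and `full_stocsy_bytes`, the random test matrix, and the 65,536-point spectrum size are all assumptions made for the example.

```python
import numpy as np

def stocsy_2d(X):
    """Full 2D STOCSY: every column of X against every other column.

    X is assumed to be (n_samples, n_points), one spectrum per row.
    """
    X = np.asarray(X, dtype=float)
    cov = np.cov(X, rowvar=False)        # (n_points, n_points) covariance
    corr = np.corrcoef(X, rowvar=False)  # (n_points, n_points) Pearson r
    return cov, corr

def full_stocsy_bytes(n_points, n_matrices=2, dtype=np.float64):
    """Memory needed to hold the dense covariance and correlation matrices."""
    return n_matrices * n_points ** 2 * np.dtype(dtype).itemsize

# Hypothetical data: 20 spectra of 300 points each (random, for shape only).
rng = np.random.default_rng(0)
X = rng.random((20, 300))
cov, corr = stocsy_2d(X)

# Selecting one row gives the result for a single driver point.
driver_idx = 42  # hypothetical driver column
cov_row, corr_row = cov[driver_idx], corr[driver_idx]

# Storage grows with the square of the number of spectral points.
gib = full_stocsy_bytes(65_536) / 2**30
print(f"65,536-point spectra: {gib:.0f} GiB for both matrices")
```

Each row of the two matrices is the result for one driver point, which is why a selected row of the 2D STOCSY can be inspected on its own.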

A typical STOCSY involves selecting a driver peak (i.e. a ppm value, which is a column in the spectral matrix X), then computing the covariance and Pearson's correlation coefficient between that column and every other column of X. This gives one vector per statistic. The two are visualized together as a covariance pseudospectrum colored by correlation value (redder peaks are more strongly correlated with the driver).
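The driver-based calculation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the SAFERnmr code. The function name `stocsy`, the synthetic two-compound dataset, and the peak positions are assumptions made for the example.

```python
import numpy as np

def stocsy(X, driver_idx):
    """Covariance and Pearson correlation of one driver column with all columns.

    X is assumed to be (n_samples, n_points), one spectrum per row.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    Xc = X - X.mean(axis=0)                  # center every column
    driver = Xc[:, driver_idx]
    cov = Xc.T @ driver / (n_samples - 1)    # covariance with the driver
    std = Xc.std(axis=0, ddof=1)
    denom = std * std[driver_idx]
    # Constant columns have zero variance; report correlation 0 there.
    corr = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)
    return cov, corr

# Synthetic dataset (hypothetical): compound A has peaks at points 50 and 120,
# compound B has a single peak at point 160.
rng = np.random.default_rng(0)
n_samples, n_points = 30, 200
idx = np.arange(n_points)

def peak(center, width=3.0):
    return np.exp(-0.5 * ((idx - center) / width) ** 2)

conc_a = rng.uniform(0.5, 2.0, n_samples)
conc_b = rng.uniform(0.5, 2.0, n_samples)
X = (np.outer(conc_a, peak(50) + 0.6 * peak(120))
     + np.outer(conc_b, peak(160))
     + 0.01 * rng.standard_normal((n_samples, n_points)))

cov, corr = stocsy(X, driver_idx=50)
print(f"r at point 120 (same compound):  {corr[120]:.3f}")
print(f"r at point 160 (other compound): {corr[160]:.3f}")
```

Plotting `cov` against the ppm axis and coloring each point by `corr` gives the pseudospectrum view: the peak at point 120 keeps its shape and relative height in `cov`, while `corr` marks it as belonging with the driver.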

STOCSY warnings:

- peak shifting between samples affects the result (lowers correlation and blurs peak shape)
- samples not containing the compound degrade the result
- overlap with other signals can, on average, be overcome, but it still affects the result
- full 2D STOCSYs can be costly to compute and store, and don't necessarily reveal useful patterns
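The first warning can be illustrated with a small simulation. The sketch below uses one compound with two peaks and compares the correlation at the second peak's nominal position with and without random per-sample shifts. All of it is assumed for illustration: the shift range of ±4 points, the peak width, the noise level, and the helper names `simulate` and `driver_corr`.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_points = 40, 200
idx = np.arange(n_points)

def peak(center, width=3.0):
    return np.exp(-0.5 * ((idx - center) / width) ** 2)

conc = rng.uniform(0.5, 2.0, n_samples)
# Hypothetical per-sample shift of the second peak, in points (-4..4).
shifts = rng.integers(-4, 5, n_samples)

def simulate(shifted):
    """One compound with peaks at points 50 and 120; the second may shift."""
    X = np.empty((n_samples, n_points))
    for i in range(n_samples):
        second_center = 120 + (shifts[i] if shifted else 0)
        X[i] = conc[i] * (peak(50) + peak(second_center))
    return X + 0.01 * rng.standard_normal(X.shape)

def driver_corr(X, driver_idx, target_idx):
    """Pearson correlation between two columns of X."""
    return np.corrcoef(X[:, driver_idx], X[:, target_idx])[0, 1]

corr_aligned = driver_corr(simulate(shifted=False), 50, 120)
corr_shifted = driver_corr(simulate(shifted=True), 50, 120)
print(f"aligned: r = {corr_aligned:.3f}")
print(f"shifted: r = {corr_shifted:.3f}")
```

In the shifted case, the intensity at point 120 depends on both concentration and peak position, so it no longer tracks the driver as closely even though both peaks come from the same compound.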