STOCSY - EBI-Metabolights/SAFERnmr GitHub Wiki

When trying to computationally identify signatures in complex data, the first approach is typically Statistical TOtal Correlation SpectroscopY (STOCSY). The idea is to simulate a TOCSY experiment statistically. A TOCSY:


  • uses magnetization transfer to probe 1H peak interactions within a spin system
  • more intense cross-peaks represent stronger connections between protons
  • 2D experiment where each peak is probed for interaction with each other peak (axes are 1H, 1H)

[image of a TOCSY]


A STOCSY:

  • uses inherent concentration variations between samples in a dataset to probe intramolecular 1H peak interactions
  • peaks which are in the same molecule rise and fall together across samples (i.e. are highly correlated)
  • covariance intensity more faithfully represents actual peak shape than correlation; correlation tells you 'where', covariance tells you 'what'
  • can be 2D, like a TOCSY

[image of a 2D STOCSY, image of a selected row result]
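
The difference between covariance and correlation noted above follows from the standard definitions. As a sketch, assume $x_{ki}$ is the intensity of sample $k$ at spectral point $i$, measured over $n$ samples (this notation is introduced here for illustration only):

$$
\operatorname{cov}(i,j) = \frac{1}{n-1}\sum_{k=1}^{n}\left(x_{ki}-\bar{x}_i\right)\left(x_{kj}-\bar{x}_j\right),
\qquad
r_{ij} = \frac{\operatorname{cov}(i,j)}{s_i\, s_j}
$$

where $\bar{x}_i$ and $s_i$ are the sample mean and sample standard deviation at point $i$. Rearranged, $\operatorname{cov}(i,j) = r_{ij}\, s_i\, s_j$. In the idealized, noise-free case where a peak's height varies only with concentration, $s_j$ is proportional to the peak's intensity at point $j$, so covariance traces the lineshape ('what'). The correlation $r_{ij}$ is scale-free and stays near 1 across the whole peak, tails included, so it only marks which points belong together ('where').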

A typical STOCSY involves selecting a driver peak (i.e. a ppm value, which is a column in the spectral matrix X), then computing the covariance and Pearson's correlation coefficient between that column and all other columns of X. This results in one vector per statistic, which are visualized together as a covariance pseudospectrum colored by correlation value (redder peaks are more strongly correlated with the driver).
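
The driver-based calculation described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the SAFERnmr implementation: the function name `stocsy`, the toy matrix, and the choice to report a correlation of 0 for flat (zero-variance) points are all assumptions made for the example.

```python
import math


def stocsy(X, driver):
    """Covariance and Pearson correlation between column `driver` and every column of X.

    X is a list of spectra (rows = samples, columns = ppm points).
    Returns two lists, each with one value per ppm point.
    """
    n_samples = len(X)
    n_points = len(X[0])

    # Mean-center every column (ppm point) across samples.
    means = [sum(row[j] for row in X) / n_samples for j in range(n_points)]
    centered = [[row[j] - means[j] for j in range(n_points)] for row in X]

    d = [row[driver] for row in centered]      # centered driver column
    d_ss = sum(v * v for v in d)               # its sum of squares

    cov, corr = [], []
    for j in range(n_points):
        col = [row[j] for row in centered]
        sxy = sum(a * b for a, b in zip(d, col))
        ss = sum(v * v for v in col)
        cov.append(sxy / (n_samples - 1))      # sample covariance
        denom = math.sqrt(d_ss * ss)
        # Correlation is undefined for flat columns; 0.0 is an assumed convention.
        corr.append(sxy / denom if denom > 0 else 0.0)
    return cov, corr


# Toy data (hypothetical): compound A contributes to points 0 and 2
# (point 2 is twice as intense), compound B contributes to point 4,
# and points 1 and 3 are flat baseline.
a = [1.0, 2.0, 3.0, 4.0]   # concentration of A in each sample
b = [2.0, 4.0, 1.0, 3.0]   # concentration of B in each sample
X = [[ai, 0.0, 2.0 * ai, 0.0, bi] for ai, bi in zip(a, b)]

cov, corr = stocsy(X, driver=0)
for j, (c, r) in enumerate(zip(cov, corr)):
    print(f"point {j}: cov={c:.3f} corr={r:.3f}")
```

On this toy matrix, points 0 and 2 are both perfectly correlated with the driver, but the covariance at point 2 is twice that at point 0, reflecting its larger intensity. Point 4 belongs to an independently varying compound and has zero correlation. For real spectral matrices, a vectorized array library would be the more practical choice.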

STOCSY warnings:

  • peak shifting affects the result (lowers correlation and blurs peak shape)
  • samples not containing the compound degrade the result
  • peak overlap can be overcome on average, but it still affects the result
  • full 2D STOCSYs can be costly to compute and store, and don't necessarily reveal useful patterns
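
To give a rough sense of the storage cost mentioned in the last warning, the sketch below compares a dense 2D STOCSY with a single-driver STOCSY. The spectrum size of 65536 points, the 8-byte (double precision) values, and the function names are assumptions chosen for illustration, not figures from SAFERnmr.

```python
def full_stocsy_bytes(n_points, bytes_per_value=8, n_matrices=2):
    """Bytes needed for dense n_points x n_points matrices.

    n_matrices=2 assumes both covariance and correlation are stored.
    """
    return n_matrices * n_points * n_points * bytes_per_value


def driver_stocsy_bytes(n_points, bytes_per_value=8, n_vectors=2):
    """Bytes needed for the covariance and correlation vectors of one driver."""
    return n_vectors * n_points * bytes_per_value


n_points = 65536  # hypothetical number of ppm points per spectrum

full_gib = full_stocsy_bytes(n_points) / 2**30
driver_kib = driver_stocsy_bytes(n_points) / 2**10

print(f"full 2D STOCSY: {full_gib:.0f} GiB")            # full 2D STOCSY: 64 GiB
print(f"single-driver STOCSY: {driver_kib:.0f} KiB")    # single-driver STOCSY: 1024 KiB
```

Under these assumptions, the full matrices grow with the square of the number of spectral points, while a single-driver result grows linearly.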