2D STOCSYs - EBI-Metabolights/SAFERnmr GitHub Wiki

So why not just do a STOCSY on all spectral points? Then, we could apply a local or global cutoff and statistically associate all spectral points in a molecule with each other! Well, that’s been tried a few times, and this amounts to 2D STOCSY:

A small region (~650 points) of a 2D STOCSY (lower panel), shown with the corresponding spectral region (upper panel), which contains citrate peaks. This one has been row-normalized, so it is not symmetric about the diagonal. The lower right panel shows what the STOCSY result would be for each row, with the driver indicated by a blue (down) arrow.

How do you interpret this? The diagonal contains the darkest points. It’s so thin here that it’s hard to see, but this is the correlation of each point with itself, which equals 1 by definition. Each row is the STOCSY whose driver is the point on the diagonal. Look at the citrate peaks about ¼ of the way across from the left. It’s tempting to see the three diagonal lines as a triplet of some sort, but it is actually a doublet: first one resonance is the driver, then the other is. Moving up the diagonal, we see a strong diagonal (indicating a larger resonance) and two strong lines that are half the length of the diagonal. These trace out the boundaries of the resonances in the citrate peak. Interestingly, halfway along the line on the diagonal, the peak flips. This indicates that the spectral point on the diagonal now belongs to the citrate resonance on the right. If the resonances were far enough apart, the diagonal/off-diagonal pair would be separated completely. What you see in the rightmost part of the matrix in those columns are weaker, less stable correlations with peaks in the rightmost part of the spectral matrix.
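To make the "each row is a STOCSY" reading concrete, here is a minimal sketch in Python with NumPy. It is not the SAFERnmr implementation: the function name `stocsy_2d`, the helper `peak`, and the synthetic two-compound data are all assumptions made for illustration, and the row normalization used in the figure is not reproduced.

```python
import numpy as np

def stocsy_2d(X):
    """Pearson correlation between every pair of spectral points.

    X is an (n_samples, n_points) spectral matrix, one spectrum per row.
    Row i of the result is the 1D STOCSY whose driver is point i.
    """
    Xc = X - X.mean(axis=0)          # center each spectral point across samples
    sd = Xc.std(axis=0)              # population standard deviation (ddof=0)
    sd[sd == 0] = np.nan             # flat points have undefined correlation
    Z = Xc / sd
    return (Z.T @ Z) / X.shape[0]    # divide by n to match ddof=0 above

# Synthetic stand-in data (hypothetical): compound A has two resonances
# (a doublet-like pair), compound B has one unrelated resonance.
rng = np.random.default_rng(0)
n_samples, n_points = 30, 200
idx = np.arange(n_points)

def peak(center, width=3.0):
    """Lorentzian-shaped peak on the point axis (illustrative only)."""
    return 1.0 / (1.0 + ((idx - center) / width) ** 2)

conc_a = rng.uniform(0.5, 2.0, n_samples)
conc_b = rng.uniform(0.5, 2.0, n_samples)
X = (np.outer(conc_a, peak(60) + peak(80))
     + np.outer(conc_b, peak(150))
     + rng.normal(0.0, 0.01, (n_samples, n_points)))

C = stocsy_2d(X)
driver_row = C[60]   # 1D STOCSY driven by the apex of the first resonance
```

In this toy case, `C[60, 80]` is close to 1 because both points follow the same concentration, which is the off-diagonal line described above, while the diagonal is 1 everywhere.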

This is great, but it’s a big task to compute this 130K × 130K correlation matrix across n samples. Realistically, we only have to compute half of it, since the correlation matrix is symmetric, and the vast majority of the resulting matrix isn’t useful anyway. However, suppose we only want to extract features from the same peak, not the entire spectral signal. For this, we only need a small window about the diagonal: a few hundred spectral points should be enough to capture most intra-peak resonances. We can stack these slices (NAs from edges in white) in a similar matrix:

Spectral Stackplot

Sliding STOCSY (local)
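The windowed computation can be sketched as follows. This is a hedged illustration rather than the SAFERnmr code: the name `sliding_stocsy`, the `half_window` parameter, and the layout (one row per driver, one column per offset from the diagonal) are assumptions, and the random input is only stand-in data.

```python
import numpy as np

def sliding_stocsy(X, half_window):
    """Correlations within +/- half_window points of the diagonal.

    X is an (n_samples, n_points) spectral matrix.
    Returns an (n_points, 2 * half_window + 1) array in which row i holds
    corr(point i, point i + offset) for offset = -half_window..half_window.
    Offsets that fall off either end of the spectrum are left as NaN.
    """
    n_samples, n_points = X.shape
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0)
    sd[sd == 0] = np.nan             # flat points have undefined correlation
    Z = Xc / sd

    band = np.full((n_points, 2 * half_window + 1), np.nan)
    for k, offset in enumerate(range(-half_window, half_window + 1)):
        # Drivers i for which the partner point i + offset exists.
        lo = max(0, -offset)
        hi = min(n_points, n_points - offset)
        band[lo:hi, k] = np.einsum(
            "si,si->i", Z[:, lo:hi], Z[:, lo + offset:hi + offset]
        ) / n_samples
    return band

# Stand-in data (random numbers, not real spectra).
rng = np.random.default_rng(1)
X = rng.normal(size=(25, 120))
band = sliding_stocsy(X, half_window=10)
```

The center column is the diagonal (all ones), and the NaN corners correspond to the white edges in the figure. Taking 130K literally as 130,000 points and assuming 8-byte floats, the full matrix would need about 135 GB, while a band 401 columns wide needs roughly 0.4 GB.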

This is easy to compute. But we still have to decide how features should be extracted from this matrix. Then a bunch of other questions come up: how far from the diagonal (identity) should you look for a molecular signature? Do those points all the way out on the right side of the matrix matter? How do you determine a satisfactory cutoff for all of these points? What do you do when a correlation peak should occur, because you can see the point, but the STOCSY peaks are just blobs because of misalignment or overlap?

Continue on to correlation peak extraction