Summary: weekly mtg 20160620 (Byron, Steve, Matt, me)
Prep
Main changes
Multiple things have changed since the SfN abstract:
- using IME
- no longer correcting bounds based on factor values
- cloud hypothesis now uses the closest point, rather than sampling from a set of near points (see the sketch after this list)
- scoring with thetaActuals, not thetas
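A minimal sketch of the closest-point change, assuming the cloud hypothesis predicts each timestep's null-space activity from stored (potent, null) pairs; `cloud_predict` and its inputs are hypothetical names, not the repo's actual code:

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_predict(potent_train, null_train, potent_test, k=1, seed=0):
    """Predict null-space activity from the k nearest stored points in
    potent space. k=1 is the new closest-point version (cloud-1)."""
    tree = cKDTree(potent_train)
    _, idx = tree.query(potent_test, k=k)
    if k == 1:
        return null_train[idx]  # closest point only (cloud-1)
    # old behavior: sample one of the k near points per test timestep
    rng = np.random.default_rng(seed)
    pick = rng.integers(0, k, size=len(potent_test))
    return null_train[idx[np.arange(len(potent_test)), pick]]
```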
The plots below are normalized to the highest and lowest scores, to make it easier to see how each of these changes affects the ordering of the hypotheses.
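For reference, a sketch of the normalization, assuming it means rescaling each date's hypothesis scores to [0, 1] by the min and max score:

```python
import numpy as np

def normalize_scores(scores):
    """Rescale a set of hypothesis scores to [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo)
```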
(1, 2, 3): Using IME, no more bounds correction, and new cloud hyp (aka cloud-1, or cloud)
First off, note that light and dark colors represent scores with and without bounds correction. This has no real effect on the ordering of scores.
Next, red is noIME, blue is yesIME. Adding IME seems to improve cloud relative to the others the most, and affects the others very little.
Finally, note the slope between cloud-1 and cloud-og: with IME, cloud-1 is much better than cloud.
4. Scoring with thetaActuals, not thetas
Now light colors are thetas, dark colors are thetaActuals.
So light red is the SfN abstract results, and dark blue is the current results (red = noIME, blue = yesIME).
Note that without IME (red), changing to thetaActuals greatly improves the cloud hyps relative to the others: for 20120709 and both 2013 dates, cloud is now one of the best; for 20120601 it is as good as habitual; and for 20120525 it is beaten by hab and prune but is as good as the mean-shifts.
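A sketch of what the scoring change amounts to, assuming scores are computed within angular bins; `bin_by_angle`, `theta_actuals`, and `thetas` are illustrative names:

```python
import numpy as np

def bin_by_angle(angles_deg, n_bins=8):
    """Assign each timestep to one of n_bins equal angular bins."""
    edges = np.linspace(0.0, 360.0, n_bins + 1)
    return np.digitize(np.mod(angles_deg, 360.0), edges[1:-1])

# the change: score within bins defined by the actual movement angle
# bins = bin_by_angle(theta_actuals)   # previously: bin_by_angle(thetas)
```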
Summary of SfN vs. now:
To summarize SfN (red) vs. current (blue):
normalized:
raw:
Or, just SfN (light red) vs. changing to thetaActuals (dark red):
Fitting 2d KDEs
tl;dr: null space activity is low-dimensional, and errors in joint KDEs (fit to 2d PCA projections) for the most part match the overall shape of the mean error.
Null space is 2- or 3-d.
All dates agree in mean and KDE error essentially exactly! So using the mean is a good proxy for fitting the entire distribution, which is cool.
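A sketch of the comparison, under the assumption that we project null-space activity onto the data's top two PCs, fit a joint KDE to the data and to each hypothesis's predictions, and take the mean absolute difference of the densities on a grid (names are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import gaussian_kde

def kde_error(null_data, null_hyp, grid_n=50):
    """Compare the joint 2d KDE of hypothesis predictions to the data's,
    in the data's top-2 PC space."""
    pca = PCA(n_components=2).fit(null_data)
    pts_data, pts_hyp = pca.transform(null_data), pca.transform(null_hyp)
    kde_data = gaussian_kde(pts_data.T)
    kde_hyp = gaussian_kde(pts_hyp.T)
    lo, hi = pts_data.min(axis=0), pts_data.max(axis=0)
    xs, ys = np.meshgrid(np.linspace(lo[0], hi[0], grid_n),
                         np.linspace(lo[1], hi[1], grid_n))
    grid = np.vstack([xs.ravel(), ys.ravel()])
    return np.mean(np.abs(kde_data(grid) - kde_hyp(grid)))
```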
Still to do:
- what is the striping below?
- still need to look for visual agreement in the kdes (e.g., you can tell unconstrained is different, but what about the others?)
Misc.
results appear to be mostly consistent across null space columns
Summary
I start by explaining the ways things have changed since the abstract: cloud becomes one of the best just by scoring appropriately (i.e., binning by thetaActuals and not by thetas), OR by using IME (in which case cloud-1 is better than cloud-200).
Steve is initially concerned that 'cloud-1' needs a new interpretation, but we didn't settle on anything; it seems similar to cloud-200, just with a smaller sample size.
We discuss IME and whether we need to cross-validate it. But IME isn't really changing the orderings much anyway, so probably not.
We talk about how interesting it is that adding IME seems to make cloud-1 so much better, but it doesn't really affect the other hyps. Just by rotating the space...?
We go through cartoons of hyps just to refresh.
Finally, I ask if there are any blockers to beginning to write a paper. Byron says we should start writing; specifically, let's brainstorm a storyboard of figures to tell the story. Need to explore how to explain this project and results from first principles. Keep mechanism discussion in the Discussion. Potentially submit to Nature Neuro or Neuron.
Steve gone for next two weeks. Byron out next Wednesday. Email update by Wed next week.
To do
IME:
- fit 2-fold, treat as 2 different exps (i.e., fit on one half, predict the other half, and vice versa; see the sketch after this list)
- scramble in null space, re-run IME (only dog-ear; don't actually do this)
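A sketch of the 2-fold idea; `fit_ime` and `predict_ime` are hypothetical stand-ins for the actual IME fitting and prediction code:

```python
import numpy as np

def two_fold_ime(trials, fit_ime, predict_ime, seed=0):
    """Split trials in half, fit IME on one half and predict the other,
    then swap, treating the two halves as two different experiments."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(trials))
    half1, half2 = np.array_split(idx, 2)
    out = []
    for train, test in [(half1, half2), (half2, half1)]:
        model = fit_ime([trials[i] for i in train])
        out.append(predict_ime(model, [trials[i] for i in test]))
    return out
```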
plot marginals (see the sketch after this list)
- show hists; use minimal smoothing
- plots of best day for each monkey, 8 bins, 8 cols
- one of data and best two hyps
- one of data and energy hyp
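A sketch of the marginals figure, assuming one column per theta bin (8 bins, 8 cols) and raw, minimally smoothed histograms overlaying data and hypotheses; the inputs are illustrative:

```python
import matplotlib.pyplot as plt

def plot_marginals(data_by_bin, hyps_by_bin, n_bins=8, hist_bins=20):
    """One column per theta bin; step histograms of data vs. hypotheses."""
    fig, axes = plt.subplots(1, n_bins, figsize=(2 * n_bins, 2), sharey=True)
    for b, ax in enumerate(axes):
        ax.hist(data_by_bin[b], bins=hist_bins, density=True,
                histtype='step', label='data')
        for name, hyp in hyps_by_bin.items():
            ax.hist(hyp[b], bins=hist_bins, density=True,
                    histtype='step', label=name)
        ax.set_title('bin %d' % b)
    axes[0].legend()
    return fig
```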
look into places where hyps make different predictions
- does one do consistently better than the other?
- or just focus on how and why they're different in these places