Glossary
Princeton Multi-Voxel Pattern Analysis - glossary
See the Manual (Data structures section) for more information on terms relating to the way the data is stored by the toolbox.
block
A group of contiguous TRs from the same condition in a particular run. Usually comprises multiple behavioural trials.
classification
In the machine learning sense, classification means taking a labelled training data set and showing the classifier algorithm examples of each condition over and over until it can successfully identify the training data. Then, the classifier's generalization performance is tested by asking it to guess the conditions of new, unseen data points.
See: Classification in the manual.
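As a concrete illustration (a minimal sketch, not a prescribed recipe): in this toolbox a classifier is specified by a class_args structure naming its training and testing functions, which the cross-validation machinery then calls for you. The train_bp/test_bp backpropagation classifier and the nHidden argument shown here follow the toolbox tutorial; other classifiers are specified the same way.

```matlab
% Minimal sketch: specifying a classifier for the toolbox to train and test.
% train_bp.m / test_bp.m are the bundled backpropagation classifier.
class_args.train_funct_name = 'train_bp';
class_args.test_funct_name  = 'test_bp';
class_args.nHidden = 0;   % no hidden layer, i.e. a simple feedforward network
% class_args is later passed to cross_validation.m, which trains on the
% training timepoints and then tests generalization on unseen timepoints.
```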
condition
The groups that you're trying to teach your classifier to distinguish, e.g. different tasks being performed by the subject in the experiment, or different stimuli being viewed.
cross-validation
When you use n-minus-one/leave-one-out cross-validation classification, you iterate over your data multiple times. Each iteration involves a fresh classifier trained on a subset of the data, and tested on the withheld data.
See: N-minus-one (leave-one-out) cross-validation
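A minimal sketch of what this looks like in practice, assuming the subj structure already contains a z-scored pattern 'epi_z', regressors 'conds' and a runs selector 'runs' (the object names follow the tutorial and are illustrative, not required):

```matlab
% Create one cross-validation selector per run (the 'runs_xval' group),
% then run the leave-one-run-out classification loop.
subj = create_xvalid_indices(subj,'runs');
class_args.train_funct_name = 'train_bp';
class_args.test_funct_name  = 'test_bp';
class_args.nHidden = 0;
[subj results] = cross_validation(subj,'epi_z','conds','runs_xval', ...
                                  'epi_z_thresh0.05',class_args);
% 'epi_z_thresh0.05' is the group of masks created by feature_select.m;
% each iteration trains a fresh classifier and tests it on the withheld run.
```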
feature selection
Deciding which of your features (e.g. voxels) you want to include in your analysis.
generalization
Testing the performance of a trained classifier on previously unseen (test) data.
header
See: Data structure - Book-keeping and the headers
history
A free-text field in the header that gets automatically appended to, creating a sort of narrative of that object's role in the analysis.
See: Data structure - Book-keeping and the headers
iteration
Running the classifier once, using a particular subset of the data for testing, and the remainder for training. For example, if you have 10 runs, you'll have 10 iterations, each withholding a different run as the testing data.
See: n minus one cross validation
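For illustration, here is roughly what the selectors for each iteration look like with 3 runs of 4 TRs each (a toy sketch; create_xvalid_indices.m builds these for you):

```matlab
% Toolbox convention for cross-validation selectors:
%   1 = use this TR for training, 2 = use it for testing, 0 = ignore it.
runs        = [1 1 1 1  2 2 2 2  3 3 3 3];   % the runs selector
runs_xval_1 = [2 2 2 2  1 1 1 1  1 1 1 1];   % iteration 1: test on run 1
runs_xval_2 = [1 1 1 1  2 2 2 2  1 1 1 1];   % iteration 2: test on run 2
runs_xval_3 = [1 1 1 1  1 1 1 1  2 2 2 2];   % iteration 3: test on run 3
```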
leave-one-out
We use 'leave-one-out' and 'n-minus-one' interchangeably to refer to the cross-validation procedure that leaves out a different subsection (e.g. run) of the data each iteration.
mask
A boolean 3D (or maybe 2D) single-TR volume indicating which voxels are to be included.
name
Every object in the _subj_ structure has a name. This is a very important field, since it is used whenever that object is accessed. Users are advised to refrain from accessing objects directly (e.g. subj.patterns{1}).
See: Data structure - innards of the _subj_ structure and Advanced - accessing _subj_ directly
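For example (a minimal sketch; the names 'epi' and 'conds' are illustrative):

```matlab
% Access objects by name through the accessor functions, rather than
% indexing subj.patterns{1} directly.
pat  = get_mat(subj,'pattern','epi');        % retrieve a pattern's matrix by name
regs = get_mat(subj,'regressors','conds');   % retrieve a regressors matrix by name
subj = set_mat(subj,'pattern','epi',pat);    % write a (possibly modified) matrix back
```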
n minus one cross validation
We use 'leave-one-out' and 'n-minus-one' interchangeably to refer to the cross-validation procedure that leaves out a different subsection (e.g. run) of the data each iteration.
object
An instance of one of the 4 main data types, e.g. a single cell in subj.patterns or subj.masks. Contains a mat field with all the data, as well as other required fields such as name, group_name, derived_from, header, etc.
See: The innards of the subj structure
one-of-n
In this toolbox, this tends to refer, for a regressors matrix, to the idea that only a single condition can be active at any timepoint. This makes sense for basic/standard classification - each timepoint belongs to one or another of the conditions, but never more than one at once.
Convolving regressors with a hemodynamic response function will lead to continuous-valued regressors, which may overlap (i.e. more than one condition may be non-zero at a given timepoint) and so violate some functions' one-of-n requirements.
check_1ofn_regressors.m allows you to test whether a regressors matrix is one-of-n.
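A toy sketch of what a one-of-n regressors matrix looks like, along with the kind of checks involved (check_1ofn_regressors.m does this sort of thing for you):

```matlab
% One-of-n regressors: (nConds x nTRs), exactly one condition active per TR.
% Rest timepoints would appear as all-zero columns.
regs = [1 1 0 0 0 0; ...   % condition A
        0 0 1 1 0 0; ...   % condition B
        0 0 0 0 1 1];      % condition C
is_binary     = all(regs(:)==0 | regs(:)==1);   % values are only 0 or 1
one_at_a_time = all(sum(regs,1) <= 1);          % no timepoint has >1 active condition
```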
pattern
A (features x timepoints) matrix, usually of voxel activities, but could also be PCA components, wavelet coefficients, GLM beta weights or a statmap.
See: Data structure - patterns.
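In other words (a minimal sketch, assuming a pattern named 'epi' already exists in the subj structure):

```matlab
% A pattern is a 2D matrix: rows = features (e.g. voxels), columns = timepoints.
pat = get_mat(subj,'pattern','epi');
[nFeatures, nTimepoints] = size(pat);
one_timepoint = pat(:,1);    % all features at the first TR
one_voxel     = pat(10,:);   % one feature's timecourse across all TRs
```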
peeking
When you use your testing data set to help with voxel selection. Basically, this is a kind of cheating: it spuriously/illegitimately inflates your classification performance.
See: Manual.
performance
The performance metric measures the similarity between the output produced by a classifier and the output it's supposed to produce.
See Performance in the Classification section of the manual.
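To make this concrete, here is a toy sketch of the simplest metric, proportion correct. (The toolbox's default metric, perfmet_maxclass.m, works along these lines, taking the most active output unit as the classifier's guess.)

```matlab
% Classifier outputs for 3 test timepoints (rows = conditions, columns = TRs).
acts     = [0.9 0.2 0.1; ...
            0.1 0.7 0.3; ...
            0.0 0.1 0.6];
desireds = [1 2 3];                  % the correct condition for each test TR
[~, guesses] = max(acts,[],1);       % guess = condition with the largest output
perf = mean(guesses == desireds);    % proportion correct, here 1.0
```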
Pre-Classification
By this, we mean the normalization and feature selection steps that take place after the data structure has been created but before classification begins, e.g. zscore_runs.m and feature_select.m.
See pre-classification.
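A typical pre-classification sequence might look like this (a sketch, assuming the subj structure already holds a raw pattern 'epi', regressors 'conds' and a runs selector 'runs'):

```matlab
subj = zscore_runs(subj,'epi','runs');        % normalization: creates 'epi_z'
subj = create_xvalid_indices(subj,'runs');    % creates the 'runs_xval' selector group
[subj] = feature_select(subj,'epi_z','conds','runs_xval');   % ANOVA voxel selection
% feature_select.m runs its statistical test separately for each
% cross-validation iteration (using only that iteration's training TRs,
% to avoid peeking) and creates one thresholded mask per iteration.
```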
regressors
For our purposes, the term 'regressors' refers to a set of values for each TR that denote the extent to which each condition is active. Used by statistical tests, and also as the teacher signal for the classifiers.
See: Data structure - regressors.
results
This is where all the information about a classification analysis is stored. It is returned by cross_validation.m.
See: Classification - results structure.
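For instance, assuming 'results' was returned by cross_validation.m (see the cross-validation entry above), the overall and per-iteration performance can be read out like this (a sketch of the most commonly used fields, illustrative rather than exhaustive):

```matlab
results.total_perf           % performance averaged over all iterations
results.iterations(1).perf   % performance on iteration 1 (first run withheld)
n_iterations = length(results.iterations);
```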
run
A single scanning session. There are usually a handful of runs in a given hour-long experiment.
selector
A set of labels, one per TR, e.g. indicating where all the runs start and finish, or which TRs should be used for training and which for testing on a given iteration.
See: Data structure - selectors.
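For example, a runs selector for 12 TRs split across 3 runs of 4 TRs each could be created by hand like this (a sketch; the name 'runs' is conventional rather than required):

```matlab
% Selectors are stored as (1 x nTimepoints) vectors.
runs = [1 1 1 1  2 2 2 2  3 3 3 3];
subj = init_object(subj,'selector','runs');
subj = set_mat(subj,'selector','runs',runs);
```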
statmap
The result of some kind of statistical test, usually performed separately for each voxel. For instance, the ANOVA yields a statmap of p values, one for each voxel. Each p value reflects how likely it is that the voxel's variation across conditions could have arisen by chance.
Statmaps are stored as patterns, since the term 'mask' is usually used to refer to a boolean 3D volume.
A mask can be created from a statmap by choosing all the voxels that are above/below some threshold.
See: Data structure - masks and Pre-classification - Statmaps.
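Conceptually, thresholding a statmap of p values into a boolean mask looks like the toy sketch below; within the toolbox, create_thresh_mask.m / feature_select.m do this for you.

```matlab
p_values = rand(1000,1);        % pretend ANOVA p values, one per voxel
mask_vec = p_values < 0.05;     % keep only voxels that discriminate between conditions
n_kept   = sum(mask_vec);       % how many voxels survived the threshold
```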
subj
The main data structure used by the toolbox. It holds all the patterns, regressors, selectors and masks for a single subject, along with their book-keeping information.
See: Data structure - innards of the _subj_ structure.
testing
Presenting a trained classifier with patterns that it has never seen before, and measuring its performance.
TR
Stands for 'repetition time'. Basically, the time taken for the scanner to acquire a single 3D brain volume. We often use it (somewhat imprecisely) to mean a single timepoint (usually of about 2s).
training
Showing a classifier lots of examples of a person's brain in condition A, and telling it each time, 'This is an example of the brain in condition A'. We then show it lots of examples of the same brain in condition B, also telling it which condition these brain examples came from. This process repeats until the classifier has learned which are which.
In reality, the examples tend to be interleaved with each other and presented in a different order each time. Most classifier algorithms can also deal with more than just two categories.
trial
A behavioural trial in the experiment, which probably spans multiple TRs. Multiple trials make up a block.
voxel selection
Whenever you apply a mask to a pattern, you are selecting voxels. This term tends to be used more often in the machine learning context of 'feature selection' - choosing which of the features (voxels) contain signal for the classification problem you are attempting.
See: 'Pre-classification - ANOVA' in the Manual.
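Conceptually, applying a boolean mask to a (voxels x timepoints) pattern keeps only the rows (voxels) flagged by the mask, as in the toy sketch below; within the toolbox this happens via named masks and the accessor functions rather than raw indexing.

```matlab
pat        = randn(1000, 120);       % 1000 voxels x 120 TRs
mask_vec   = rand(1000,1) < 0.1;     % pretend ~10% of voxels survive selection
pat_masked = pat(mask_vec, :);       % the reduced pattern: selected voxels only
```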