Organizing a project folder with BIDS - GlascherLab/LabWiki GitHub Wiki

Your project folder should adhere to the Brain Imaging Data Structure (BIDS) format. This is a standardized way to organize files and folder names and to store specific metadata in a machine-readable format (mostly JSON files). You can familiarize yourself with the BIDS format using this link.

The BIDS folder tree

The BIDS folder tree specifies certain folders and their contents. Here is an example of the top-level folders (with a description of their contents) from an fMRI study (matchpennies):

.
├── code                               project-specific script/programs etc.
├── containers                         docker/singularity container files (e.g. fmriprep)
├── derivatives                        preprocessed data, modeling results, statistical analysis
├── doc                                general documentation for the project
├── plots                              data plots and results figures
├── sourcedata                         the original datafiles (e.g. DICOM, BDF etc.)
├── sub-01                             raw (unprocessed) data files for each subject
├── <sub-02 until sub-43> omitted
└── task                               all files for running the experimental task

Each subject-specific folder (e.g. sub-01) has a subfolder for each data modality:

sub-01
├── anat                               anatomical image (structural T1)
├── beh                                behavioral data
├── eeg                                EEG data
├── eye                                eye-tracking data (non-standard name)
├── loc                                electrode location coordinates (non-standard name)
├── fmap                               field map (phase/magn images)
└── func                               functional MRI data (EPI)

If data are collected in multiple sessions (e.g. on separate days), insert a session level (ses-01, ses-02, etc.) below the subject level. The modality folders then go inside each session folder.
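The layout described above can be set up with a short script. The following is only a minimal sketch: the function name make_skeleton and its parameters are made up for illustration, the top-level folder names are copied from the matchpennies example, and the default modality list is an assumption that you should adapt to your study.

```python
from pathlib import Path

# Top-level folders copied from the example tree above.
TOP_LEVEL = ("code", "containers", "derivatives", "doc", "plots", "sourcedata", "task")


def make_skeleton(root, n_subjects, modalities=("anat", "beh", "fmap", "func"), n_sessions=0):
    """Create the top-level folders plus sub-XX[/ses-YY]/<modality> folders.

    Hypothetical helper; returns the list of modality folders it created.
    """
    root = Path(root)
    for name in TOP_LEVEL:
        (root / name).mkdir(parents=True, exist_ok=True)

    created = []
    for sub in range(1, n_subjects + 1):
        sub_dir = root / f"sub-{sub:02d}"
        # Without sessions, the modality folders sit directly under the subject.
        parents = [sub_dir / f"ses-{ses:02d}" for ses in range(1, n_sessions + 1)] or [sub_dir]
        for parent in parents:
            for modality in modalities:
                path = parent / modality
                path.mkdir(parents=True, exist_ok=True)
                created.append(path)
    return created
```

Calling make_skeleton("myproject", n_subjects=3) creates sub-01 to sub-03 with one subfolder per modality; passing n_sessions=2 adds ses-01 and ses-02 levels in between. Existing folders are left untouched, so the script can be rerun when new subjects are added.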

The derivatives folder contains all derived data and analyses (e.g. behavioral modeling, image preprocessing, statistical analyses etc.):

derivatives/
├── analysis                           statistical analyses
│   ├── first_level                    subject-specific analyses
│   │   ├── sub-01
│   │   │   └── model_onset            name of analysis
│   │   ├── sub-02 (omitted)
│   └── second_level                   group-wise analyses
│       └── model_onset                same name as in first_level
├── modeling                           computational modeling
│   ├── ActiveInference                modeling approach 1
│   ├── kToM                           modeling approach 2
│   └── stan
└── preprocessed                       preprocessed neuroimaging data (parallel pipelines at this level)
    ├── sub-01                         for each subject (these are the inputs to the statistical analyses)
    │   ├── anat                       preprocessed T1
    │   ├── fmap                       preprocessed fieldmap data and derived images (e.g. VDM)
    │   └── func                       preprocessed EPI
    ├── sub-02 (omitted)

The code folder contains general scripts at the top level and subfolders for specific scripts for each part of the derivatives.

code                    general scripts (e.g. in R or Matlab), further subfolders corresponding to subfolders in derivatives
├── R                   e.g. all scripts in R
├── first_level         scripts for 1st level analyses (e.g. SPM batches)
├── preproc             scripts for preprocessing (different pipelines in different folders)
└── second_level        scripts for 2nd level analyses

The doc folder is for all kinds of documentation about the project. Useful subfolders are:

doc
├── grant               grant proposal for the project (no read permission for "other", e.g. chmod o-rwx)
├── ethics              ethics proposal for the project (no read permission for "other", e.g. chmod o-rwx)
├── papers              useful papers
└── summary             results summaries and presentations
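If you set up the doc folder from a script, the chmod o-rwx step for the grant and ethics folders can be done in the same place. This is a minimal sketch for a POSIX system (such as a Linux server); the function name remove_other_permissions is made up for illustration, and it changes only the folder itself, not the files inside it.

```python
import stat
from pathlib import Path


def remove_other_permissions(folder):
    """Clear read/write/execute for "other" on one folder, like `chmod o-rwx`.

    Hypothetical helper; not recursive. Returns the new permission bits.
    """
    folder = Path(folder)
    mode = stat.S_IMODE(folder.stat().st_mode)
    # S_IRWXO is the mask for the three "other" permission bits.
    folder.chmod(mode & ~stat.S_IRWXO)
    return stat.S_IMODE(folder.stat().st_mode)
```

For example, a folder with mode 755 ends up with mode 750. Owner and group permissions are not changed.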

For more information, please consult the BIDS documentation.

The BIDS file name convention

According to the BIDS specification, each file and its position in the BIDS tree should be identifiable by its filename. This seems a bit over the top to me (because the location of the file already details its provenance), but in some cases I have seen its value, especially when rearranging parts of the BIDS tree. Although it makes scripting file names a bit awkward, please adhere to the BIDS file naming convention.

The general template for a file name is:

sub-XXX_task-XXX_run-XXX_<name>-<value>_<modality>.<suffix>  (e.g. sub-01_task-sft_run-01_eeg.xdf)

Each <name>-<value> component is separated from the other components with underscores (_). Each filename starts with the subject (e.g. sub-01) and should contain the task (e.g. task-sft) and the run number (e.g. run-01). The final modality is separated with another underscore (e.g. _eeg).
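Because every file name follows the same template, building and parsing names is easy to script. The two functions below are a minimal sketch of the template given above; bids_filename and parse_bids_filename are hypothetical names, not part of any BIDS tool. Extra name-value pairs are appended after the run, as in the template. The BIDS specification defines its own order for such components, so check it before using anything beyond sub, task and run.

```python
def bids_filename(sub, modality, suffix, task=None, run=None, **entities):
    """Assemble sub-XX_task-XX_run-XX_<name>-<value>_<modality>.<suffix>."""
    parts = [f"sub-{sub}"]
    if task is not None:
        parts.append(f"task-{task}")
    if run is not None:
        parts.append(f"run-{run}")
    # Any further <name>-<value> pairs follow in the order they were passed.
    parts += [f"{name}-{value}" for name, value in entities.items()]
    parts.append(modality)
    return "_".join(parts) + f".{suffix}"


def parse_bids_filename(filename):
    """Split a file name into (name-value dict, modality, suffix)."""
    # Split at the first dot so that suffixes like "nii.gz" stay intact.
    stem, _, suffix = filename.partition(".")
    *pairs, modality = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, modality, suffix


print(bids_filename("01", "eeg", "xdf", task="sft", run="01"))
# prints: sub-01_task-sft_run-01_eeg.xdf
```

Generating names through one function keeps all scripts in the project consistent, and the parser helps when a script has to pick out, say, all runs of one task from a folder listing.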

More information can be found in the BIDS specification.

BIDS metadata

There are two types of metadata files: JSON files (.json), which store name-value pairs in a defined format, and tab-separated value files (.tsv), which are tables of data with subjects/trials/events as rows and variables as columns, separated by tabs so that they remain readable by humans. In addition, there is an unstructured README.md file (with Markdown formatting) at the top level of the project folder that explains the project in general terms. This is the first point of entry into the dataset for a researcher unfamiliar with the project (e.g. when the data are published in a public repo).

These metadata contain important information that is also useful when writing up the paper (e.g. information about imaging parameters). So even if collecting this information and creating the corresponding JSON file can appear to be a waste of time, it will become very useful later on. Please take care to enter the information in the metadata files early and as you go through your project. Then the time investment is limited and you (and possibly others) get to benefit from this effort later on.

Here is a list of some of the required BIDS metadata files. Most of these are located in the top-level project folder (aka $BIDSROOT):

  • $BIDSROOT/dataset_description.json: a specific JSON file with general information about the project (e.g. title, authors, acknowledgments, grant number etc.)
  • $BIDSROOT/participants.tsv: a table with subject-specific information (e.g. demographic information, experimental condition, subject ID of partner in social interaction experiments)
  • $BIDSROOT/participants.json: an accompanying side-car with a longer description of the variables (columns) in participants.tsv
  • _events.tsv: a table with experimental events (e.g. trials in rows) and variables (e.g. CUE, CHOICE, OUTCOME) in columns. The table contains onsets for fMRI/EEG and accompanies every data file in the func/eeg subfolder in the subject folder (e.g. sub-01/func/sub-01_task-sft_run-01_events.tsv)
  • $BIDSROOT/events.json: a JSON side-car file with a longer description of the events in a sub-XX..._events.tsv file.
  • _beh.tsv: a table with behavioral data for each run in the experiment with trials as rows and variables as columns (e.g. stimulus configuration, choices, outcomes, RTs etc.). These files reside in the subject-specific folders sub-XX/beh. For convenience, I usually also keep a binary .mat file (or the equivalent for other analysis software, e.g. .Rdata) in the same folder. But the .tsv should be the master reference file, which is created from the original logfiles of the presentation software (e.g. PTB)
  • $BIDSROOT/beh.json: a side-car JSON file with long description (and possible values) of the variables in the _beh.tsv files.
  • $BIDSROOT/{anat,epi,fmap,eeg}.json etc.: JSON files with parameters for each imaging modality, very useful for the Methods section of the paper.
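The first three files in this list can be written with a few lines of Python. This is a minimal sketch, not a complete implementation: the function names and the example column names are made up for illustration, and the default version string is only a placeholder for the BIDS version you actually follow. It assumes that dataset_description.json needs at least the Name and BIDSVersion fields and that participants.tsv starts with a participant_id column; please check the current specification for the full list of fields.

```python
import csv
import json
from pathlib import Path


def write_dataset_description(bids_root, name, authors, bids_version="1.8.0"):
    """Write $BIDSROOT/dataset_description.json (hypothetical helper)."""
    description = {"Name": name, "BIDSVersion": bids_version, "Authors": list(authors)}
    path = Path(bids_root) / "dataset_description.json"
    path.write_text(json.dumps(description, indent=2) + "\n", encoding="utf-8")
    return path


def write_participants(bids_root, rows, columns):
    """Write participants.tsv and its participants.json side-car.

    rows:    list of dicts, one per subject, each with a "participant_id" key
    columns: dict mapping each column name to a longer description
    """
    root = Path(bids_root)
    fieldnames = ["participant_id"] + [c for c in columns if c != "participant_id"]
    with open(root / "participants.tsv", "w", newline="", encoding="utf-8") as f:
        # Missing values are written as "n/a".
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter="\t",
                                lineterminator="\n", restval="n/a")
        writer.writeheader()
        writer.writerows(rows)
    sidecar = {col: {"Description": text} for col, text in columns.items()}
    (root / "participants.json").write_text(json.dumps(sidecar, indent=2) + "\n",
                                            encoding="utf-8")
```

A call could look like write_participants(root, [{"participant_id": "sub-01", "age": 25}], {"age": "Age in years"}). Writing the .tsv and its side-car from the same call keeps the column descriptions from drifting out of sync with the table.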

NOTE: events.json and beh.json are really important for understanding the (behavioral) data in the experiment. Please make sure that the descriptions in these JSON files are accurate and informative!

More information and template JSON files can be found in the BIDS documentation.

For an example of these files, please consult the files in the matchpennies project (fMRI) and the tiger project (EEG hyperscanning). They are both on dendrite in /projects/crunchie/glaescher. The matchpennies BIDS project is almost complete (including the README.md file); the tiger project is still in progress (especially the metadata files).
