UK Biobank fMRIPrep outputs - neurohub/neurohub_documentation GitHub Wiki

fMRIPrep preprocessing of nearly 40,000 UK Biobank subjects has been completed using CBRAIN, and the outputs have been packaged for user-level access.

Some background

The fMRIPrep pipeline has been run on the 37,732 UK Biobank subjects with neuroimaging data, and the outputs are available both on the CBRAIN portal and directly on Beluga.

The dataset is very large: it contains 16,935,185 files and uses 117 terabytes of disk space. The Alliance supercomputers took nearly 5 months to produce the files, consuming over 103 years of computing time in total. During processing, over two billion intermediate files were produced.

Because of the huge volume of data, we kindly ask our community not to make full copies of these outputs.

Please note the fMRIPrep task parameters include the following output spaces:

  • "anat"
  • "fsaverage"
  • "MNI152NLin6Asym"
  • "MNI152NLin2009cAsym:res-2"

The fMRIPrep preprocessed data for all participants with imaging data can be accessed through CBRAIN and on Beluga.

Prerequisite: you must have been granted access to UK Biobank in order to access the output data.

1. Access the fMRIPrep outputs in CBRAIN

To access the outputs in CBRAIN, you must be a member of the NeuroHub UK Biobank project.

  • Go to the project tab and search for the NeuroHub-UK-Biobank project.


  • Select the File Type fMRIPrep Output

The 37,732 fMRIPrep outputs will be displayed.

Tips:

If you only need specific files from the fMRIPrep outputs, consider running SimpleFileExtractor in CBRAIN. For example, if you are working with 20 fMRIPrep outputs but only need the thickness and surface files, you can select those 20 outputs and launch SimpleFileExtractor directly in CBRAIN.

2. Access the fMRIPrep outputs in Beluga

The dataset is available on Beluga to UK Biobank users at the following path:

/project/6008063/neurohub/ukbb/derivatives/fmriprep

How to access the fMRIPrep output data files

As described in the Accessing Imaging Data section of our documentation, the imaging data are provided as a series of SquashFS files on the Alliance's Beluga system. Each SquashFS file is a filesystem image and must be mounted through an Apptainer container.

More information about SquashFS through Apptainer can be found here.

Each SquashFS file in the fmriprep directory contains about 200 subjects. Once you are in that directory, follow these steps to access the files:

  1. Load the Apptainer module on Beluga: module load apptainer

  2. Mount a SquashFS file

    • Use the --overlay option to mount a file. The following command opens a shell in the container with one SquashFS file mounted:

    $ apptainer shell --overlay fmriprep_000_1xxxxx1-1xxxxx6.sqfs sing_squashfs.sif

    • To mount several SquashFS files at once, repeat the --overlay option for each file:

    $ apptainer shell --overlay fmriprep_000_1xxxxx1-1xxxxx6.sqfs --overlay fmriprep_039_2xxxxx2-2xxxxx1.sqfs sing_squashfs.sif

  3. Inside the container, the subjects are available under the path /neurohub/ukbb/derivatives/fMRIPrep

  4. Type exit to quit the container.
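
The overlay step above can be sketched as a small shell helper that emits one --overlay option per SquashFS file, which is convenient when several files need to be mounted together. The function name overlay_flags is hypothetical and is not part of Apptainer or of the NeuroHub tooling. The sketch also assumes that the file names contain no whitespace, as in the examples above.

```shell
# Hypothetical helper (not part of Apptainer): print one "--overlay FILE"
# pair per line for every SquashFS file passed as an argument.
overlay_flags() {
    for sqfs in "$@"; do
        printf '%s %s\n' --overlay "$sqfs"
    done
}

# Example use on Beluga, after `module load apptainer` (not run here).
# The unquoted $(...) relies on word splitting, so file names must not
# contain whitespace:
#   apptainer shell $(overlay_flags fmriprep_000_*.sqfs fmriprep_039_*.sqfs) sing_squashfs.sif
```

For a non-interactive check, apptainer exec accepts the same --overlay options, so the same flags can be used to run a single command such as ls on /neurohub/ukbb/derivatives/fMRIPrep instead of opening a shell.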

Note:

More information on how to access the files is provided in the fmriprep_README.txt file, located directly in the fmriprep folder.
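
Because each SquashFS file holds only about 200 subjects, it helps to know which file to mount for a given subject. The sketch below assumes that the two numbers at the end of each file name (for example fmriprep_000_1xxxxx1-1xxxxx6.sqfs) are the first and last subject IDs stored in that file. This is an inference from the naming pattern, so check fmriprep_README.txt before relying on it. The function name sqfs_for_subject and the subject ID in the usage comment are made up for illustration.

```shell
# Hypothetical helper: print every SquashFS file whose name range covers
# the given numeric subject ID.
# ASSUMPTION: names look like fmriprep_<index>_<firstID>-<lastID>.sqfs.
# Usage: sqfs_for_subject SUBJECT_ID FILE.sqfs [FILE.sqfs ...]
sqfs_for_subject() (
    subject=$1
    shift
    for path in "$@"; do
        name=${path##*/}       # drop the directory part
        range=${name%.sqfs}    # drop the extension
        range=${range##*_}     # keep "<firstID>-<lastID>"
        first=${range%-*}
        last=${range#*-}
        # Skip names that do not end in two plain integers.
        case "$first" in ''|*[!0-9]*) continue ;; esac
        case "$last" in ''|*[!0-9]*) continue ;; esac
        if [ "$subject" -ge "$first" ] && [ "$subject" -le "$last" ]; then
            printf '%s\n' "$path"
        fi
    done
)

# Example use on Beluga (the subject ID is fictitious):
#   sqfs_for_subject 1000150 /project/6008063/neurohub/ukbb/derivatives/fmriprep/*.sqfs
```

The function only inspects file names and never opens the archives, so it can be run before loading Apptainer. Its output can then be passed to the --overlay option described in the steps above.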
