Functional magnetic resonance imaging (fMRI) - norahollenstein/cognitiveNLP-dataCollection GitHub Wiki

fMRI datasets for NLP

Functional magnetic resonance imaging (fMRI) is a technique for measuring brain activity. It detects the changes in blood oxygenation and flow that occur in response to neural activity: when a brain region becomes more active, it consumes more oxygen, and blood flow to that region increases to meet the demand. fMRI can therefore be used to produce activation maps showing which brain regions are involved in a particular mental process.

This list contains datasets in the following languages:

Chinese
Dutch
English
Multilingual

Chinese

SMN4Lang

Stimulus: 6 hours of naturalistic stories
Subjects: 12
Data: https://openneuro.org/datasets/ds004078/versions/1.2.1
Reference: Wang et al. (2022)

This dataset also includes an MEG part.

Stories

Stimulus: Text - reading short stories
Subjects: 30
Data: Available upon request from authors
Reference: Dehghani et al. (2017)

This dataset also includes an English part.

Dutch

Mother Of Unification Studies (MOUS) dataset

Stimulus: 360 well-formed and 360 scrambled sentences, presented visually and auditorily
Subjects: 204
Features: Data available in Brain Imaging Data Structure (BIDS) format
Data: https://data.donders.ru.nl/collections/di/dccn/DSC_3011020.09_236?0
Reference: Schoffelen et al. (2019)

This dataset also contains MEG recordings.
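For readers new to the BIDS format mentioned above: BIDS file names encode metadata as underscore-separated `key-value` entities, followed by a suffix and an extension. The sketch below shows one way to split such a name using only the Python standard library. The file name and the `parse_bids_filename` helper are hypothetical illustrations, not taken from the MOUS dataset or from any BIDS library.

```python
from pathlib import PurePosixPath


def parse_bids_filename(path: str) -> dict:
    """Split a BIDS-style file name into entities, suffix, and extension.

    Hypothetical helper for illustration; it does not validate the name
    against the BIDS specification.
    """
    name = PurePosixPath(path).name
    # Everything after the first dot is treated as the extension (e.g. nii.gz).
    stem, _, ext = name.partition(".")
    # The last underscore-separated chunk is the suffix (e.g. "bold");
    # the chunks before it are key-value entities such as "sub-01".
    *entity_parts, suffix = stem.split("_")
    entities = dict(part.split("-", 1) for part in entity_parts)
    return {
        "entities": entities,
        "suffix": suffix,
        "extension": "." + ext if ext else "",
    }


# Hypothetical file name following the BIDS naming pattern.
info = parse_bids_filename("sub-01/func/sub-01_task-reading_bold.nii.gz")
print(info["entities"], info["suffix"], info["extension"])
```

For real analyses, a dedicated BIDS library is likely a better choice than hand-written parsing; this sketch only illustrates the naming scheme.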

Narrative Brain Dataset

Stimulus: Speech - listening to 3 stories
Data: https://osf.io/utpdy/
Reference: Lopopolo et al. (2018)

English

Narratives

Stimulus: Listening to stories spanning a variety of media, including commercially-produced radio and internet broadcasts, authors and actors reading written works, professional storytellers performing in front of live audiences, and subjects verbally recalling previous events.
Subjects: 345 adults
Data: https://datasets.datalad.org/?dir=/labs/hasson/narratives
Reference: Nastase et al. (2020)

Natural Stories Corpus

Stimulus: 10 stories (English texts edited to contain many low-frequency syntactic constructions)
Subjects: 78
Data: https://osf.io/eyp8q/
Reference: Shain et al. (2019)

Self-paced reading data for the same stimuli are also available.

Mixed Encoding

Stimulus: Mixed - sentences, word clouds and images
Subjects: 17
Data: https://evlab.mit.edu/sites/default/files/documents/index.html
Reference: Pereira et al. (2018)

Stories

Stimulus: Text - reading short stories
Subjects: 30
Data: Available upon request from authors
Reference: Dehghani et al. (2017)

This dataset also includes a Chinese part.

Semantic Maps

Stimulus: 10 autobiographical stories (10-15 minutes each)
Subjects: 7
Data: https://github.com/HuthLab/speechmodeltutorial
Reference: Huth et al. (2016)

Alice

Stimulus: Audio - listening to Alice in Wonderland (Chapter 1)
Subjects: 29
Data: https://sites.lsa.umich.edu/cnllab/2016/06/11/data-sharing-fmri-timecourses-story-listening/
Reference: Brennan et al. (2016)

(Pseudo)words

Stimulus: Text - words and pseudowords
Subjects: 36
Data: Available upon request from Sarah Schuster
Reference: Schuster et al. (2015)

Harry Potter

Stimulus: Text - reading chapter 9 of Harry Potter
Subjects: 8
Data: http://www.cs.cmu.edu/~fmri/plosone/
Reference: Wehbe et al. (2014)

Nouns

Stimulus: Words/images of 60 nouns
Subjects: 9
Data: http://www.cs.cmu.edu/afs/cs/project/theo-73/www/science2008/data.html
Reference: Mitchell et al. (2008)

Multilingual

Le Petit Prince Dataset

Stimulus: Le Petit Prince audiobook
Subjects: 49 English participants, 35 Chinese participants, 28 French participants
Data: https://openneuro.org/datasets/ds003643/versions/1.0.4
Reference: Li et al. (2021)
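
Two of the datasets in this list (SMN4Lang and Le Petit Prince) are hosted on OpenNeuro, where each dataset is identified by an accession number (`ds` followed by digits) and a version. When scripting downloads, it can help to extract these two values from the links. The following is a minimal sketch using only the Python standard library; the `parse_openneuro_url` helper and the `OpenNeuroRef` type are hypothetical names introduced here, and the URL pattern is assumed from the two links listed on this page.

```python
import re
from typing import NamedTuple, Optional


class OpenNeuroRef(NamedTuple):
    """Accession number and (optional) version of an OpenNeuro dataset."""
    accession: str
    version: Optional[str]


# Pattern assumed from the OpenNeuro links on this page:
# .../datasets/<accession>/versions/<version>
_PATTERN = re.compile(r"openneuro\.org/datasets/(ds\d+)(?:/versions/([\w.]+))?")


def parse_openneuro_url(url: str) -> OpenNeuroRef:
    """Extract the accession number and version from an OpenNeuro URL."""
    match = _PATTERN.search(url)
    if match is None:
        raise ValueError(f"not an OpenNeuro dataset URL: {url}")
    return OpenNeuroRef(match.group(1), match.group(2))


for url in (
    "https://openneuro.org/datasets/ds004078/versions/1.2.1",  # SMN4Lang
    "https://openneuro.org/datasets/ds003643/versions/1.0.4",  # Le Petit Prince
):
    ref = parse_openneuro_url(url)
    print(ref.accession, ref.version)
```

The accession number and version can then be passed to whichever download tool is preferred.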