Home - norahollenstein/cognitiveNLP-dataCollection GitHub Wiki

Collection of openly available data sources for cognitively-inspired NLP

This growing collection lists cognitive, neurolinguistic, and behavioral datasets of human language processing, in any language, that are useful for natural language processing researchers interested in cognitively-inspired and multi-modal NLP. Only datasets that are openly and freely available to the research community are included.

This collection covers cognitive processing data recorded during language understanding (i.e., reading text or listening to auditory stimuli). The datasets are sorted by the language of the presented stimuli.

Datasets for language production or speech synthesis are not (yet) included in this repo.

If you have any questions or suggestions to expand this collection, just open a new issue or contact me directly: [email protected]

This list contains datasets of the following recording modalities:
Electroencephalography (EEG)
Eye-tracking
Functional magnetic resonance imaging (fMRI)
Keystroke metrics
Magnetoencephalography (MEG)
Mouse movements
Self-paced reading