Keystroke metrics - norahollenstein/cognitiveNLP-dataCollection GitHub Wiki
Keystroke datasets for NLP
Keystroke data are behavioral biometrics recorded during text generation and have been extensively used in psycholinguistic and writing research to gain insights into cognitive processing. Keystroke dynamics represent the user's typing patterns and correlate with eye movements.
This collection contains datasets in the following languages:
English
Observations from HALIE
A dataset of human users interacting with LMs to solve information-seeking tasks
Stimulus: Answering multiple choice questions and solving crossword puzzles
Participants: 189 crowd workers for social dialogue, 304 for crossword solving, and 342 for question answering
Data: https://github.com/stanford-crfm/halie
Reference: Lee et al. 2023
CoAuthor Dataset
A human-AI collaborative writing dataset that captures interactions between writers and GPT-3 language model instances.
Stimulus: keystrokes from writing creative and argumentative texts based on prompts from GPT-3 language models.
Participants: 63
Data: https://coauthor.stanford.edu/
Reference: Lee et al. 2022
Scrolling Interactions to Predict Readability
Stimulus: advanced and elementary texts from the OneStopEnglish corpus
Participants: 598 (native speakers and English L2 speakers)
Data: https://github.com/siangooding/readability_scroll
Reference: Gooding et al. 2021
University at Buffalo Keystroke Dynamics
Stimulus: keystrokes and mouse coordination based on transcription as well as free text typing
Participants: 157
Data: https://www.buffalo.edu/cubs/research/datasets.html#title_429116244
Reference: Sun et al. 2016
Clarkson University Keystroke Dataset
Stimulus: password input, free text questions and transcription
Participants: 39
Data: https://citer.clarkson.edu/research-resources/biometric-dataset-collections-2/clarkson-university-keystroke-dataset/
Reference: Vural et al. 2014
Stewart Keystroke and Stylometry Dataset
Stimulus: free-text input averaging 966 words per subject
Participants: 40
Data: https://bitbucket.org/biometrics/dataset-stewart-keystroke/wiki/browse/
Reference: Stewart et al. 2011
Romanian
Politehnica University Timisoara Keystroke Dataset
Stimulus: free text
Participants: 80
Data: https://sites.google.com/view/cataliniapa/timisoara-kd-data-set
Reference: Iapa & Cretu 2021