Recode - Cghlewis/data-wrangling-functions GitHub Wiki
These are functions used for recoding values. Recoding often happens in situations such as:
- Research scales need to be reverse coded for scoring
- NA values need to be recoded
- A value was coded incorrectly in a survey and needs to be recoded
- A character variable needs to be recoded into a numeric variable
- Survey participants were allowed to write in responses to a question, and similar responses need to be collapsed into the same categories to standardize the variable
- Several response categories need to be collapsed into a smaller number of categories
- A multiple choice question needs to be scored as correct or incorrect
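As a rough illustration of a few of these situations, the sketch below recodes a small made-up survey data frame. The data, the column names, the `-99` missing code, the 1 to 5 scale, and the answer key ("b") are all assumptions chosen for the example; they do not come from the linked pages.

```r
library(dplyr)

# Hypothetical survey data; every name and value here is invented for illustration
survey <- tibble(
  id    = 1:4,
  item1 = c(1, 5, -99, 3),            # assumed 1-5 scale, -99 marks a missing response
  agree = c("yes", "no", "yes", "no"),
  q1    = c("b", "a", "b", "c")       # multiple choice; assume "b" is the correct answer
)

recoded <- survey %>%
  mutate(
    # Recode the missing-value code to NA
    item1 = na_if(item1, -99),
    # Reverse code a 1-5 scale: 1 becomes 5, 2 becomes 4, and so on (NA stays NA)
    item1_r = 6 - item1,
    # Recode a character variable into a numeric variable
    agree_num = case_when(agree == "yes" ~ 1, agree == "no" ~ 0),
    # Score a multiple choice question as correct (1) or incorrect (0)
    q1_correct = if_else(q1 == "b", 1, 0)
  )
```

Replacing `-99` with `NA` before reverse coding matters here: otherwise `6 - item1` would turn the missing code into a plausible-looking but meaningless value.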
Recode values
- Recode values for one variable
- Recode factors
- Recode dates
- Recode same values for multiple variables
- Reverse code values
- Recode NA
Recode using a data dictionary
- Recode using a long formatted personal data dictionary
- Recode using a wide formatted personal data dictionary
- Recode using a packaged dictionary
Collapse categories
Main functions used in examples
| Package | Functions |
|---|---|
| dplyr | recode(); recode_factor(); case_when(); na_if(); if_else() |
| tidyr | replace_na() |
| gendercoder | recode_gender() |
| forcats | fct_recode() |
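To give a feel for how some of the functions in this table are called, here is a minimal sketch using `recode()`, `replace_na()`, and `fct_recode()` on invented data. The data frame `d`, its columns, and the replacement labels are hypothetical, and `recode_gender()` is left out because its behavior depends on the dictionary shipped with gendercoder.

```r
library(dplyr)
library(tidyr)
library(forcats)

# Hypothetical data; names and values are invented for illustration
d <- tibble(
  grade  = c("K", "1", "2", NA),
  status = factor(c("tx", "ctrl", "tx", "ctrl"))
)

d2 <- d %>%
  mutate(
    # recode() takes old = new pairs; unmatched values are left as they are
    grade_num = as.numeric(recode(grade, "K" = "0")),
    # replace_na() fills NA with a chosen value
    grade_filled = replace_na(grade, "unknown"),
    # fct_recode() takes new = old pairs (the reverse of recode())
    status = fct_recode(status, treatment = "tx", control = "ctrl")
  )
```

Note the argument order: `recode()` is written as old value = new value, while `fct_recode()` is written as new level = old level, which is an easy thing to mix up.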
Other functions used in examples
| Package | Functions |
|---|---|
| dplyr | mutate(); across(); if_any(); if_all(); rename_with(); filter(); select(); pull() |
| base | as.numeric(); levels(); paste0(); table(); is.numeric(); tolower() |
| janitor | tabyl() |
| stringr | str_detect(); regex() |
| labelled | val_labels() |
| tidyselect | everything(); all_of() |
| purrr | map() |
| tibble | deframe() |
Resources
- https://datascienceineducation.com/c07.html
- https://www.rdocumentation.org/packages/dplyr/versions/0.7.8/topics/case_when
- http://larmarange.github.io/labelled/reference/recode.haven_labelled.html
- https://stackoverflow.com/questions/68984415/what-does-operator-mean-in-r
- https://tim-tiefenbach.de/post/2023-recode-columns/