Adding Sources to COVID19_cases_deaths_data.Rmd - namedentities/NESScovid19 GitHub Wiki
To add a source, we ultimately want a long-format file in which each row is a place-date and the columns "confirmed", "deaths", "people_test", "tested_people", and "tested_samples" hold cumulative values.
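If a source reports daily new counts rather than running totals, the cumulative columns have to be derived first. The following is a minimal sketch of one way to do that with dplyr; the place names, the new_confirmed and new_deaths columns, and all values are invented for illustration and are not taken from the Rmd.

```r
library(dplyr)

# Hypothetical daily counts for two places (values invented for illustration)
daily <- tibble(
  place = c("A", "A", "A", "B", "B"),
  date_asdate = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                          "2020-03-01", "2020-03-02")),
  new_confirmed = c(1, 2, 0, 5, 3),
  new_deaths = c(0, 1, 0, 0, 1)
)

# Sort by place and date, then take a running sum within each place
cumulative <- daily %>%
  arrange(place, date_asdate) %>%
  group_by(place) %>%
  mutate(confirmed = cumsum(new_confirmed),
         deaths = cumsum(new_deaths)) %>%
  ungroup() %>%
  select(place, date_asdate, confirmed, deaths)
```

Each row of cumulative is then one place-date with running totals, which is the shape described above.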
Run the first two chunks to load the rex_clean function, the admin0, admin1, and admin2 datasets, and the needed packages.
Read in the relevant data file as openXX, where "XX" is the ISO 3166-1 country code.
Get the data into date-location format and convert the locations to ISO 3166-1 codes. Then create the admin columns and apply rex_admin_function. The following chunk of code can be used as a template. The date-parsing function (ymd in the template) should match the source's date format.
openXX_long <- openXX %>%
  # code to get into long format %>%
  mutate(date_asdate = ymd(date)) %>% # general form: mutate(date_asdate = [dateformat]([datecolumn]))
  mutate(admin0_name_original = [column/string/blank]) %>%
  mutate(admin1_name_original = [column/string/blank]) %>%
  mutate(admin2_name_original = [column/string/blank]) %>%
  mutate(admin0_name_clean = admin0_name_original %>% rex_clean()) %>%
  mutate(admin1_name_clean = admin1_name_original %>% rex_clean()) %>%
  mutate(admin2_name_clean = admin2_name_original %>% rex_clean()) %>%
  rex_admin_function()
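As a small illustration of the [dateformat] placeholder in the template, the sketch below shows three lubridate parsers applied to the same calendar day written in different orders. The example strings are invented; which parser a given source needs depends on how that source writes its dates.

```r
library(lubridate)

# Same day, three common orderings; pick the parser that matches the source
d_ymd <- ymd("2020-03-15")   # year-month-day
d_dmy <- dmy("15/03/2020")   # day-month-year
d_mdy <- mdy("03/15/2020")   # month-day-year
```

All three calls return a Date object, so the resulting date_asdate column has the same type regardless of the source's format.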
Example countries depending on format or issue:
- each state has its own column or multiple columns: Mexico
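For the one-column-per-state layout mentioned above, a reshape with tidyr's pivot_longer is one possible approach. This is only a sketch under assumptions: the StateA and StateB columns and their values are invented, and the actual Mexico chunk in the Rmd may handle the file differently.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one column per state (names and values invented)
openXX <- tibble(
  date = c("2020-03-01", "2020-03-02"),
  StateA = c(1, 4),
  StateB = c(0, 2)
)

# Turn every column except date into (state name, value) pairs
openXX_long <- openXX %>%
  pivot_longer(cols = -date,
               names_to = "admin1_name_original",
               values_to = "confirmed")
```

The state names end up in admin1_name_original, ready for the cleaning step in the template.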
Create a joining file, forjoining_openXX. Create a dataset column and select the necessary columns (dataset, gid, geonameid, wikidata_id, date_asdate, confirmed, deaths, people_test, tested_people, tested_samples).
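The joining step can be sketched as follows. The input tibble here is a hypothetical stand-in for the output of the earlier steps, with invented identifiers and values; the "openXX" label used for the dataset column is also an assumption, and testing columns the source does not report are simply filled with NA.

```r
library(dplyr)

# Hypothetical stand-in for the long data after the admin step (values invented)
openXX_long <- tibble(
  gid = c("XX.1_1", "XX.1_1"),
  geonameid = NA_integer_,
  wikidata_id = NA_character_,
  date_asdate = as.Date(c("2020-03-01", "2020-03-02")),
  confirmed = c(1, 3),
  deaths = c(0, 1),
  people_test = NA_real_,     # not reported by this hypothetical source
  tested_people = NA_real_,
  tested_samples = NA_real_,
  extra_column = "dropped by select()"
)

# Label the source and keep only the columns needed for joining
forjoining_openXX <- openXX_long %>%
  mutate(dataset = "openXX") %>%
  select(dataset, gid, geonameid, wikidata_id, date_asdate,
         confirmed, deaths, people_test, tested_people, tested_samples)
```

Listing the columns explicitly in select() means the step fails with an error if one of them is missing, which helps catch a source that was not fully prepared.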