Config File Options - HopkinsIDD/COVID19_Minimal GitHub Wiki

We recommend reviewing the Getting Started pages before reading this page on the detailed configuration file options. For an example of a full configuration file, see config.yml in this repository or the Supplementary Material of our preprint.

Overview

The model configuration file config.yml controls all of the options currently available. This file has a tabbed outline structure. We will refer to keys using their full position in the outline. For example, we denote

spatial_setup:
  ...
  geodata: minimal

as spatial_setup::geodata having a value of minimal.
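
To make the `::` notation concrete, here is a minimal Python sketch that resolves such a path against a config that has already been parsed into nested dictionaries. The helper name `get_key` and the tiny `config` dict are illustrative assumptions, not part of the pipeline.

```python
from functools import reduce

def get_key(config: dict, path: str):
    """Look up a nested value using the wiki's `a::b::c` notation."""
    # Walk one level deeper into the nested dict for each path component.
    return reduce(lambda node, key: node[key], path.split("::"), config)

# Hypothetical config fragment, as it would look after YAML parsing.
config = {"spatial_setup": {"geodata": "minimal"}}

geodata_value = get_key(config, "spatial_setup::geodata")
print(geodata_value)
```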

Description of Types/Formats

date is [year]-[month]-[day]. (e.g., 2020-01-31)
boolean is either "TRUE" or "FALSE"
probability is a float between 0 and 1
distribution is the following config structure:

Item	Required?	Type/Format
distribution	required	"fixed" or "uniform"
value	required for "fixed"	fraction or probability
low	required for "uniform"	fraction or probability
high	required for "uniform"	fraction or probability
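
As an illustration of how a distribution block could be consumed, the sketch below draws one value from a "fixed" or "uniform" specification. The function name `draw` is hypothetical, and the pipeline's own sampling code may differ; this only mirrors the table above.

```python
import random

def draw(spec: dict, rng: random.Random) -> float:
    """Draw one value from a `distribution` config block (illustrative only)."""
    kind = spec["distribution"]
    if kind == "fixed":
        # A fixed distribution always returns its single `value`.
        return float(spec["value"])
    if kind == "uniform":
        # A uniform distribution draws between `low` and `high`.
        return rng.uniform(float(spec["low"]), float(spec["high"]))
    raise ValueError(f"unsupported distribution: {kind!r}")

rng = random.Random(0)  # seeded so repeated runs give the same draws
fixed_draw = draw({"distribution": "fixed", "value": 0.5}, rng)
uniform_draw = draw({"distribution": "uniform", "low": 0.6, "high": 0.7}, rng)
```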

Sections

Global Header

These global configuration options typically sit at the top of the configuration file.

Item	Required?	Type/Format	Description
file_is_unedited	must be absent		Remove it!
name	required	string	typically named after the region/location you are modeling
start_date	required	date	model simulation start date
end_date	required	date	model simulation end date
nsimulations	required	int	number of simulations to run
dt	required	float	simulation time step in days
dynfilter_path	optional	path to file	path to filtering text file
report_location_name	optional	string

name: Hawaii 
start_date: 2020-01-31 
end_date: 2020-12-31 
nsimulations: 1000
dt: 0.25
report_location_name: Hawaii
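
A quick sanity check of the global header can catch common mistakes before a run. The sketch below is an assumption-laden illustration: `check_header` is a hypothetical helper, the dates are assumed to have been parsed into `datetime.date` objects, and the rule that `end_date` must fall after `start_date` is our assumption rather than something the pipeline is documented to enforce.

```python
from datetime import date

# Keys marked "required" in the Global Header table.
REQUIRED = ("name", "start_date", "end_date", "nsimulations", "dt")

def check_header(cfg: dict) -> list:
    """Return a list of human-readable problems found in the global header."""
    problems = []
    if "file_is_unedited" in cfg:
        problems.append("remove file_is_unedited")
    problems.extend(f"missing {key}" for key in REQUIRED if key not in cfg)
    # Assumption: a simulation window must have a positive length.
    if "start_date" in cfg and "end_date" in cfg and cfg["end_date"] <= cfg["start_date"]:
        problems.append("end_date is not after start_date")
    return problems

header = {
    "name": "Hawaii",
    "start_date": date(2020, 1, 31),
    "end_date": date(2020, 12, 31),
    "nsimulations": 1000,
    "dt": 0.25,
}
print(check_header(header))
```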

`spatial_setup` section

Config Item	Required?	Type/Format	Description
base_path	required	path to folder	base path for spatial files
setup_name	required	string	spatial folder name
geodata	required	path to file relative to `base_path`
mobility	required	path to file relative to base_path
popnodes	required	string	name of population column in geodata
nodenames	required	string	name of location nodes column in geodata
census_year	optional	integer (year)
modeled_states	optional	list of location codes	vector of locations that will be modeled
include_in_report	optional	boolean	name of boolean column in geodata
shapefile_name	optional	path to file relative to base_path
shapefile	optional	path to file relative to base_path; identical to shapefile_name
nonUS_mobility_setup	required for non-US locations	path to file relative to base_path
nonUS_pop_setup	required for non-US locations	path to file relative to base_path
geoid_params_file	required for non-US locations if running age-specific hospitalization adjustment	path to file with geoid-specific relative risks of health outcomes

spatial_setup:
  base_path: data/HI
  setup_name: HI
  geodata: geodata.csv
  mobility: mobility.txt
  popnodes: population
  nodenames: geoid
  include_in_report: include_in_report
  modeled_states:
    - HI
  census_year: 2010
  shapefile: shp/counties_2010_HI.shp
  shapefile_name: shp/counties_2010_HI.shp

`geodata` file

geodata is a .csv with column headers, with at least two columns: nodenames and popnodes.
nodenames is the name of a column in geodata that specifies the geo IDs of an area. This column must be unique.
popnodes is the name of a column in geodata that specifies the population of the nodenames column.
include_in_report is the name of an optional column in geodata that specifies which nodenames are included in the report. Models may include more locations than simply the location of interest.

Example geodata file format

geoid,population,include_in_report
10001,1000,TRUE
20002,2000,FALSE
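
The geodata requirements above (both named columns present, unique node names) can be checked with a few lines of Python. This is a minimal sketch using only the standard library; `check_geodata` is a hypothetical helper, and treating populations as integers is an assumption.

```python
import csv
import io

def check_geodata(text: str, nodenames: str, popnodes: str) -> dict:
    """Validate geodata CSV text and return {node name: population}."""
    reader = csv.DictReader(io.StringIO(text))
    for column in (nodenames, popnodes):
        if column not in (reader.fieldnames or []):
            raise ValueError(f"geodata is missing column {column!r}")
    rows = list(reader)
    ids = [row[nodenames] for row in rows]
    if len(ids) != len(set(ids)):
        raise ValueError(f"values in column {nodenames!r} must be unique")
    # Assumption: populations are whole numbers.
    return {row[nodenames]: int(row[popnodes]) for row in rows}

sample = "geoid,population,include_in_report\n10001,1000,TRUE\n20002,2000,FALSE\n"
populations = check_geodata(sample, nodenames="geoid", popnodes="population")
```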

`mobility` file

The mobility file is a .csv file (the file name must end in .csv) containing long-form comma-separated values. The columns must be named ori, dest, and amount, where amount is the number of individuals moving from place ori to place dest. Unassigned relations are assumed to be zero. The values of ori and dest must exactly match the nodenames column in geodata.csv.

Example mobility file format

ori, dest, amount
10001, 20002, 3
20002, 10001, 3

It is also possible, but NOT RECOMMENDED, to specify the mobility file as a .txt file with space-separated values in the shape of a matrix. This matrix is symmetric and of size K x K, where K is the number of rows in geodata:

0 3
3 0
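
To show how the two mobility formats relate, the sketch below converts the long-form example into the equivalent K x K matrix, filling unassigned pairs with zero. The helper `long_to_matrix` is hypothetical, and the row/column order is assumed to follow the order of nodes in geodata.

```python
import csv
import io

def long_to_matrix(text: str, node_order: list) -> list:
    """Convert long-form mobility CSV text into a K x K matrix (rows = ori)."""
    index = {node: i for i, node in enumerate(node_order)}
    k = len(node_order)
    # Unassigned relations are assumed to be zero.
    matrix = [[0.0] * k for _ in range(k)]
    # skipinitialspace tolerates the space after each comma in the example.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        matrix[index[row["ori"]]][index[row["dest"]]] = float(row["amount"])
    return matrix

long_form = "ori, dest, amount\n10001, 20002, 3\n20002, 10001, 3\n"
matrix = long_to_matrix(long_form, node_order=["10001", "20002"])
```

With the two-node example, this yields the same matrix as the space-separated form shown above.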

`importation` section (optional)

This section is optional. It is used by the covidImportation package to import global air importation data for seeding infections into the United States.

If you wish to include it, here are the options.

Config Item	Required?	Type/Format	Description
census_api_key	required	string	get an API key
travel_dispersion	required	number	how dispersed daily travel data is; default = 3.
maximum_destinations	required	integer	Number of airports to limit importation to
dest_type	required	categorical	location type
dest_country	required	string (Country)	ISO3 code for country of importation. Currently only USA is supported
aggregate_to	required	categorical	location type to aggregate to
cache_work	required	boolean	whether to save case data
update_case_data	required	boolean	deprecated; whether to update the case data or use saved data
draw_travel_from_distribution	required	boolean	whether to add additional stochasticity to travel data; default is FALSE
print_progress	required	boolean	whether to print progress of importation model simulations
travelers_threshold	required	integer	include airports with at least the `travelers_threshold` mean daily number of travelers
airport_cluster_distance	required	numeric	cluster airports within `airport_cluster_distance` km
param_list	required	See section below	see below

`importation::param_list`

Config Item	Required?	Type/Format	Description
incub_mean_log	required	numeric	incubation period, log mean
incub_sd_log	required	numeric	incubation period, log standard deviation
inf_period_nohosp_mean	required	numeric	infectious period, non-hospitalized, mean
inf_period_nohosp_sd	required	numeric	infectious period, non-hospitalized, sd
inf_period_hosp_mean_log	required	numeric	infectious period, hospitalized, log-normal mean
inf_period_hosp_sd_log	required	numeric	infectious period, hospitalized, log-normal sd
p_report_source	required	numeric	reporting probability, Hubei and elsewhere
shift_incid_days	required	numeric	mean delay from infection to reporting of cases; default = -10
delta	required	numeric	days per estimations period

importation:
  census_api_key: "fakeapikey00000"
  travel_dispersion: 3
  maximum_destinations: Inf
  dest_type: state
  dest_country: USA
  aggregate_to: airport
  cache_work: TRUE
  update_case_data: TRUE
  draw_travel_from_distribution: FALSE
  print_progress: FALSE
  travelers_threshold: 10000
  airport_cluster_distance: 80
  param_list:
    incub_mean_log: log(5.89)
    incub_sd_log: log(1.74)
    inf_period_nohosp_mean: 15
    inf_period_nohosp_sd: 5
    inf_period_hosp_mean_log: 1.23
    inf_period_hosp_sd_log: 0.79
    p_report_source: [0.05, 0.25]
    shift_incid_days: -10
    delta: 1

`seeding` section

There are two seeding methods: 1) based on air importation (FolderDraw) and 2) based on earliest identified cases (PoissonDistributed).

FolderDraw is required if the importation section is present and requires folder_path. Otherwise, put PoissonDistributed, which requires lambda_file.

Config Item	Required?	Type/Format	Description
method	required	"FolderDraw" or "PoissonDistributed"
folder_path	required for FolderDraw	path to folder
lambda_file	required for PoissonDistributed	path to file
delay_incidC	optional for PoissonDistributed	numeric	Assumption for number of days delay between infection and case confirmation for seeding with the PoissonDistributed method. Default is 5 days.
ratio_incidC	optional for PoissonDistributed	numeric	Assumption for ratio of infections to confirmed cases for seeding with the PoissonDistributed method. Default is 10 infections per confirmed case.
casedata_file	required for non-US locations	path to the data file from which the seeding setup file will be created

If using the importation section of the config and the air importation model:

seeding:
  method: FolderDraw
  folder_path: importation/HI/

or if seeding according to the earliest identified cases:

seeding:
  method: PoissonDistributed
  lambda_file: data/HI/seeding.csv
  delay_incidC: 5
  ratio_incidC: 10
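
One plausible reading of delay_incidC and ratio_incidC is sketched below: each confirmed case count is scaled up by the infection-to-case ratio and shifted back in time by the delay. This is an assumption made for illustration, not a description of the pipeline's actual seeding code, and `seeding_row` is a hypothetical helper.

```python
from datetime import date, timedelta

def seeding_row(confirmed_on: date, confirmed_cases: int,
                delay_incidC: int = 5, ratio_incidC: float = 10):
    """Map confirmed cases to an assumed (infection date, seeded infections) pair."""
    # Assumption: infections occurred `delay_incidC` days before confirmation.
    infected_on = confirmed_on - timedelta(days=delay_incidC)
    # Assumption: each confirmed case stands for `ratio_incidC` infections.
    return infected_on, confirmed_cases * ratio_incidC

infected_on, seeded = seeding_row(date(2020, 3, 6), confirmed_cases=2)
```

With the defaults, two cases confirmed on 2020-03-06 would correspond to 20 infections on 2020-03-01 under this reading.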

`seir` section

Config Item	Required?	Type/Format	Description
parameters::alpha	optional	fraction	Transmission dampening parameter; Default is 1.0 and reasonable values for respiratory viruses range from 0.88-0.99
parameters::sigma	required	fraction or probability	Inverse of the incubation period in days
parameters::gamma	required	distribution	Inverse of the infectious period in days
parameters::R0s	required	distribution	Basic reproduction number

seir:
  parameters:
    alpha: 0.5
    sigma: 1 / 5.2
    gamma:
      distribution: uniform
      low: 1 / 6
      high: 1 / 2.6
    R0s:
      distribution: uniform
      low: 3.5
      high: 4
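
Because sigma and gamma are inverses of periods in days, the example values translate directly into durations. The sketch below assumes the `1 / 5.2` style entries arrive as strings and evaluates them without `eval()`; the helper `evaluate_ratio` is hypothetical and handles only plain numbers and single `a / b` ratios.

```python
def evaluate_ratio(text) -> float:
    """Evaluate a plain number or a single 'a / b' expression."""
    numerator, _, denominator = str(text).partition("/")
    if denominator.strip():
        return float(numerator) / float(denominator)
    return float(numerator)

sigma = evaluate_ratio("1 / 5.2")
gamma_low = evaluate_ratio("1 / 6")
gamma_high = evaluate_ratio("1 / 2.6")

incubation_days = 1 / sigma        # about 5.2 days
longest_infectious = 1 / gamma_low   # about 6 days (smallest gamma)
shortest_infectious = 1 / gamma_high  # about 2.6 days (largest gamma)
```

Note that the low end of the gamma range corresponds to the longest infectious period, since the period is the inverse of the rate.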

`interventions` section

This section lets you specify custom intervention scenarios.

scenarios specifies which settings to run. This does not need to include all items defined in settings.

Config Item	Required?	Type/Format
scripts_path	required	path name
scenarios	required	list of strings for scenario names
settings	required	See section below

`interventions::settings::[setting_name]`

Each string in scenarios should have a corresponding named setting in settings.

Right now, there are three types of templates: Reduce, ReduceR0, and Stacked. The ReduceR0 template is a special, redundant case of the Reduce template for the r0 parameter. The Stacked template allows you to combine multiple interventions (Reduce or Stacked) into a single intervention scenario.

Item	Required?	Type/Format	Description
template	required	"Reduce", "ReduceR0" or "Stacked"
parameter	required for Reduce	"alpha", "r0", "gamma", "sigma"	Specify the parameter associated with the intervention reduction (alpha = mixing coefficient, r0 = basic reproductive number, gamma = inverse of the infectious period, sigma = inverse of the incubation period)
period_start_date	optional for Reduce, ReduceR0		date between global `start_date` and `end_date`; default is global `start_date`
period_end_date	optional for Reduce, ReduceR0		date between global `start_date` and `end_date`; default is global `end_date`
value	required for Reduce, ReduceR0	distribution
affected_geoids	optional for Reduce, ReduceR0		list of geoids, which must be in geodata
fatigue_rate::distribution	optional for Reduce, ReduceR0	distribution	Indicates the rate of intervention fatigue
fatigue_frequency_days	optional for Reduce, ReduceR0	numeric	Number of days it takes for the NPI to reach the new value
fatigue_min	optional for Reduce, ReduceR0	numeric	Minimum intervention effectiveness for fatiguing interventions; values below this minimum are clipped
fatigue_type	optional for Reduce, ReduceR0	"geometric"	If specified, produce a geometric fatiguing rate instead of a linear fatiguing rate

interventions:
  scenarios:
    - None
    - Scenario1
    - Scenario2
  settings:
    None:
      template: ReduceR0
      period_start_date: 2020-04-01
      period_end_date: 2020-05-15
      value:
        distribution: fixed
        value: 0
    Wuhan:
      template: Reduce
      parameter: r0
      period_start_date: 2020-04-01
      period_end_date: 2020-05-15
      value:
        distribution: uniform
        low: .81
        high: .89
    UK:
      template: ReduceR0
      period_start_date: 2020-05-16
      period_end_date: 2020-05-31
      value:
        distribution: uniform
        low: .71
        high: .83
    Scenario2:
      template: Reduce      
      parameter: r0                       # Parameter to reduce
      period_start_date: 2020-02-01
      period_end_date: 2020-05-15
      value:
        distribution: uniform            # Value of reduction as it was specified before.
        low: .6
        high: .7
      fatigue_rate:
        distribution: uniform            # Value of fatigue
        low: .1
        high: .2
      fatigue_frequency_days: 4*7         # Number of days for the NPI to reach a new_value
      fatigue_min: .2                  # 0 if unspecified, clip when the NPI reaches this value.
      #fatigue_type: geometric     # if there, produce geometric fatigue (so reduction of reduction)
    Scenario1:
      template: Stacked
      scenarios:
        - Wuhan
        - UK

`hospitalization` section

There are two modules for calculating health outcomes. The first, which is preferred, enables location-specific health outcome risks (e.g., accounting for differences in age distribution between locations) by using the risk of various health outcomes relative to a national average. The second specifies unadjusted, population-wide health outcome risks and requires slightly different parameters than the location-specific calculation.

Location-specific `hospitalization` calculations

A location-specific age-adjustment requires the existence of a "geoid params" file.

For county-level models in the US, this file is already provided in COVIDScenarioPipeline/sample_data/geoid-params.csv and you need only set hospitalization::paths::run_age_adjust to TRUE.
For models outside of the US, you will need to create this geoid params file (see the Getting Started page for Non US locations under "Calculate age-specific outcomes parameters for each district"). You will also need to specify the path to this file under spatial_setup::geoid_params_file (see above) and set hospitalization::paths::run_age_adjust to TRUE.

Config Item	Required?	Type/Format	Description
paths::output_path	required	path to folder
paths::run_age_adjust	required	boolean
parameters::time_hosp	required	Two numbers (log median, log sd)	time from symptom onset to hospitalization admission (in days)
parameters::time_disch	required	Two numbers (log median, log sd)	time from hospitalization to hospital discharge
parameters::time_ICU	required	Two numbers (log median, log sd)	time from hospital admission to ICU admission
parameters::time_ICUdur	required	Two numbers (log median, log sd)	time spent in ICU
parameters::time_vent	required	Two numbers (log median, log sd)	time from ICU admission to ventilator use
parameters::time_ventdur	required	Two numbers (log median, log sd)	time spent on ventilator
parameters::p_death	required	list of probabilities	probability of death given infection
parameters::p_death_names	required	list of strings	one name for each value in p_death
parameters::p_hosp_inf	required	list of probabilities	probability of hospitalization given infection
parameters::time_onset_death	required	Two numbers (log median, log sd)	time from symptom onset to death

hospitalization:
  paths:
    output_path: hospitalization
    run_age_adjust: TRUE
  parameters:
    time_hosp: [log(7), 0.3]
    time_disch: [log(11.5), log(1.22)]
    time_ICU: [log(3), 0.3]
    time_ICUdur: [log(8), 0.2]
    time_ventdur: [log(7), 0.2]
    time_vent: [log(1), 0.4]
    p_death: [.0025, .005, .01]
    p_death_names: ["low","med","high"]
    p_hosp_inf: [0.025, 0.05, 0.1]
    time_onset_death: [2.84, 0.52]
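
The two-number delay parameters can be read as the parameters of a log-normal distribution. The sketch below assumes the first number is the log of the median and the second is the standard deviation on the log scale, and it draws samples with Python's standard library to show that `[log(7), 0.3]` gives a median delay of roughly 7 days. The helper `draw_delays` is hypothetical and not part of the pipeline.

```python
import math
import random
import statistics

def draw_delays(params, n: int, seed: int = 0) -> list:
    """Draw n log-normal delays (in days) from a (log median, log sd) pair."""
    log_median, log_sd = params
    rng = random.Random(seed)  # seeded for reproducibility
    return [rng.lognormvariate(log_median, log_sd) for _ in range(n)]

time_hosp = (math.log(7), 0.3)  # mirrors `time_hosp: [log(7), 0.3]`
delays = draw_delays(time_hosp, n=10_000)
sample_median = statistics.median(delays)
```

For a log-normal distribution the median equals the exponential of the log-scale location parameter, which is why the sample median lands near 7.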

Un-adjusted population-wide config options for `hospitalization` calculations

Config Item	Required?	Type/Format	Description
paths::output_path	required	path to folder
parameters::time_hosp	required	Two numbers (log mean, log sd)	time from symptom onset to hospitalization admission (in days)
parameters::time_disch	required	Two numbers (log mean, log sd)	time from hospitalization to hospital discharge
parameters::time_death	required	Two numbers (log mean, log sd)	time from hospitalization to death
parameters::time_ICU	required	Two numbers (log mean, log sd)	time from hospital admission to ICU admission
parameters::time_ICUdur	required	Two numbers (log mean, log sd)	time spent in ICU
parameters::time_vent	required	Two numbers (log mean, log sd)	time from ICU admission to ventilator use
parameters::p_death	required	list of probabilities	probability of death given infection
parameters::p_death_names	required	list of strings	one name for each value in p_death
parameters::p_death_rate	required	probability	probability of death given hospitalization (single value only)
parameters::p_ICU	required	probability	probability of ICU admission given hospitalization
parameters::p_vent	required	probability	probability of ventilation given ICU admission

hospitalization:
  paths:
    output_path: hospitalization
  parameters:
    time_hosp: [1.23, 0.79]
    time_disch: [log(11.5), log(1.22)]
    time_death: [log(11.25), log(1.15)]
    time_ICU: [log(8.25), log(2.2)]
    time_ICUdur: [log(16), log(2.96)]
    time_vent: [log(10.5), log((10.5-8)/1.35)]
    p_death: [.0025, .005, .01]
    p_death_names: ["low","med","high"]
    p_death_rate: 0.1
    p_ICU: 0.32
    p_vent: 0.15
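
Because p_ICU and p_vent are conditional probabilities, they chain multiplicatively. The sketch below works through the example values for a hypothetical cohort of 1,000 hospitalizations; `expected_outcomes` is an illustrative helper and the calculation ignores timing and stochasticity.

```python
def expected_outcomes(hospitalizations: float, params: dict) -> dict:
    """Expected counts implied by the conditional probabilities (no timing)."""
    icu = hospitalizations * params["p_ICU"]      # ICU given hospitalization
    vent = icu * params["p_vent"]                 # ventilation given ICU
    deaths = hospitalizations * params["p_death_rate"]  # death given hospitalization
    return {"icu": icu, "vent": vent, "deaths": deaths}

params = {"p_death_rate": 0.1, "p_ICU": 0.32, "p_vent": 0.15}
outcomes = expected_outcomes(1000, params)
```

With these values, 1,000 hospitalizations imply about 320 ICU admissions, about 48 ventilated patients, and about 100 deaths.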

`report` section

The report section is completely optional and provides settings for making an R Markdown report. For an example of a report, see the Supplementary Material of our preprint.

If you wish to include it, here are the options.

Config Item	Required?	Type/Format	Description
data_settings::pop_year		integer
plot_settings::plot_intervention		boolean
formatting::scenario_labels_short		list of strings; one for each scenario in `interventions::scenarios`
formatting::scenario_labels		list of strings; one for each scenario in `interventions::scenarios`
formatting::scenario_colors		list of strings; one for each scenario in `interventions::scenarios`
formatting::pdeath_labels		list of strings
formatting::display_dates		list of dates
formatting::display_dates2	optional	list of dates	a second list of display dates that can optionally be supplied to specific report functions

report:
  data_settings:
    pop_year: 2018
  plot_settings:
    plot_intervention: TRUE
  formatting:
    scenario_labels_short: ["UC", "S1"]
    scenario_labels:
      - Uncontrolled
      - Scenario 1
    scenario_colors: ["#D95F02", "#1B9E77"]
    pdeath_labels: ["0.25% IFR", "0.5% IFR", "1% IFR"]
    display_dates: ["2020-04-15", "2020-05-01", "2020-05-15", "2020-06-01", "2020-06-15"]
    display_dates2: ["2020-04-15", "2020-05-15", "2020-06-15"]
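
Several formatting lists need one entry per scenario, so a length check is a cheap safeguard. The sketch below is illustrative: `mismatched_labels` is a hypothetical helper, and the two-item scenario list is an assumption chosen to match the two labels in the example.

```python
# Formatting keys documented as "one for each scenario".
LABEL_KEYS = ("scenario_labels_short", "scenario_labels", "scenario_colors")

def mismatched_labels(scenarios: list, formatting: dict) -> list:
    """Return the formatting keys whose length differs from the scenario count."""
    return [key for key in LABEL_KEYS
            if key in formatting and len(formatting[key]) != len(scenarios)]

formatting = {
    "scenario_labels_short": ["UC", "S1"],
    "scenario_labels": ["Uncontrolled", "Scenario 1"],
    "scenario_colors": ["#D95F02", "#1B9E77"],
}
mismatches = mismatched_labels(["None", "Scenario1"], formatting)
```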
