Adding new covariates - HopkinsIDD/cholera-mapping-pipeline GitHub Wiki
This page refers to adding new covariates in the old pipeline.
Creating a config entry
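The first step to adding a covariate is to create a config entry for it. The fields depend on whether the covariate is static or variable in time.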
Static in Time
The config entry for a static covariate should look like this:
dist_to_lakes:
  name: distance_to_lakes
  description: Distance to closest lake
  alias: dist_lakes
  abbr: dl
  dir: lake_dist/lake_dist_5km_raster.tif
  type: static
  time_aggregator: ~
  space_aggregator: average
  transform: ~
  unit: km
name
: Name of the covariate. Can be used to access the covariate from the pipeline.
description
: High-level description of the covariate. Unused by the pipeline.
alias
: Short name of the covariate. Can be used to access the covariate from the pipeline.
abbr
: Very short name for the covariate. Used to create filenames for pipeline results that use this covariate.
dir
: Path to a single geotiff file. The field is called dir because, for temporal covariates, it holds a directory.
type
: Always "static".
time_aggregator
: Always "~".
space_aggregator
: Function used to aggregate data of this type across space. Used by the pipeline to aggregate between spatial scales.
transform
: Function used to transform the data. Used by the pipeline to transform the data after aggregation (I think).
unit
: Units associated with this data. Unused by the pipeline.
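As a quick sanity check before running the pipeline, the field rules above can be tested with a small script. The following is a minimal sketch in Python and is not part of the pipeline or taxdat: the helper name check_static_covariate, the list of required keys, and the file-extension check are all assumptions derived from the descriptions above. The entry is written as the dictionary a YAML parser would produce, where "~" becomes None.

```python
# Hypothetical helper for illustration; not part of the pipeline or taxdat.
REQUIRED_KEYS = {
    "name", "description", "alias", "abbr", "dir", "type",
    "time_aggregator", "space_aggregator", "transform", "unit",
}


def check_static_covariate(entry):
    """Return a list of problems found in a static covariate entry (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    if entry.get("type") != "static":
        problems.append('type must be "static"')
    if entry.get("time_aggregator") is not None:
        problems.append("time_aggregator must be ~ (null)")
    # Assumption: a geotiff path ends in .tif or .tiff.
    if not str(entry.get("dir", "")).endswith((".tif", ".tiff")):
        problems.append("dir should point to a single geotiff file")
    return problems


# The dist_to_lakes entry from above, as parsed YAML (~ becomes None).
dist_to_lakes = {
    "name": "distance_to_lakes",
    "description": "Distance to closest lake",
    "alias": "dist_lakes",
    "abbr": "dl",
    "dir": "lake_dist/lake_dist_5km_raster.tif",
    "type": "static",
    "time_aggregator": None,
    "space_aggregator": "average",
    "transform": None,
    "unit": "km",
}

print(check_static_covariate(dist_to_lakes))
```

An empty list means the entry satisfies the rules in this sketch. It does not guarantee that the pipeline will accept the entry.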
Variable in Time
pop:
  name: population
  description: population in each gridcell
  alias: pop
  abbr: p
  dir: pop/
  type: temporal
  time_aggregator: average
  res_time: 1 years
  space_aggregator: sum
  transform: ~
  unit: number
name
: Name of the covariate. Can be used to access the covariate from the pipeline.
description
: High-level description of the covariate. Unused by the pipeline.
alias
: Short name of the covariate. Can be used to access the covariate from the pipeline.
abbr
: Very short name for the covariate. Used to create filenames for pipeline results that use this covariate.
dir
: Directory containing the netcdf files. Historically, we have used one netcdf file per time slice. I am not sure if this is required.
type
: Always "temporal".
time_aggregator
: Function used to aggregate data of this type across time. Used by the pipeline to aggregate to different time scales.
res_time
: The time resolution of the flat files. Used by the pipeline to determine the grid size in time.
space_aggregator
: Function used to aggregate data of this type across space. Used by the pipeline to aggregate between spatial scales.
transform
: Function used to transform the data. Used by the pipeline to transform the data after aggregation (I think).
unit
: Units associated with this data. Unused by the pipeline.
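To make the difference between time_aggregator and space_aggregator concrete, here is a small Python illustration using the settings from the pop entry (average across time, sum across space). It is a sketch of the idea only, not the pipeline's implementation: the function names and the toy numbers are made up for this example, and the order in which the pipeline applies the two steps is an assumption.

```python
from statistics import mean

# Toy values for illustration only: population in 4 fine gridcells,
# one list per yearly time slice (res_time: 1 years).
pop_by_year = {
    2015: [100, 200, 300, 400],
    2016: [110, 210, 310, 410],
}

# Hypothetical mapping from config values to functions.
AGGREGATORS = {"average": mean, "sum": sum}


def aggregate_time(slices, how):
    """Combine time slices cell by cell (the time_aggregator step)."""
    fn = AGGREGATORS[how]
    return [fn(values) for values in zip(*slices)]


def aggregate_space(cells, how):
    """Combine fine gridcells into one coarse cell (the space_aggregator step)."""
    return AGGREGATORS[how](cells)


# Population is averaged over the two years, then summed over the four cells.
two_year_pop = aggregate_time(pop_by_year.values(), "average")
coarse_pop = aggregate_space(two_year_pop, "sum")
print(two_year_pop, coarse_pop)
```

The choice of aggregator follows from what the covariate measures. Counts such as population add up across space, whereas a quantity such as distance to the closest lake is better averaged, which is why the two example entries use different space_aggregator values.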
Converting to netcdf4
Most of the data sets we use are not originally netcdf4 (.nc) files. To convert them, use the taxdat::write_netcdf function.