Data Paths and Datasets instructions - rosepearson/GeoFabrics GitHub Wiki

These options control how input data is located and included at each stage. They are grouped under the data_paths and datasets keys.

Data Paths [Required]

The data_paths entries specify the local locations where data is to be read from or written to. All paths should be forward-slash separated file paths. All data_paths entries aside from local_cache should be either absolute paths or paths relative to the local_cache. Some data path keywords are considered in all GeoFabrics framework stages and others only in some stages. Accepted keywords are:

| keyword | required | type | default | stage | description |
| --- | --- | --- | --- | --- | --- |
| local_cache | yes | str | - | all | The location where remotely sourced data is downloaded and where the generated 'geofabrics.log' file is written. |
| subfolder | yes | str | results | all | The folder to store generated results in. |
| downloads | yes | str | downloads | all | The folder to store any downloaded raster, vector or LiDAR data in. |
| raw_dem | yes | str | raw_dem.nc | dem | The file name (can include a full file path) under which to save the DEM generated from just LiDAR and any specified coarse DEM. |
| result_dem | yes | str | generated_dem.nc | dem | The file name (can include a full file path) under which to save the hydrologically conditioned DEM. |
| result_geofabric | yes | str | generated_geofabric.nc | roughness | The file name (can include a full file path) under which to save the geofabric with hydrologically conditioned DEM and roughness layers. |
| land | no | str | - | all | The location of a local copy of a polygon defining the land (e.g. the NZ Coastline). If not defined here or in the datasets section, no offshore area will be defined. |
| ocean_contours | no | list of str | - | dem | The location of any local vector data defining the bathymetry contours (e.g. the NZ Depth Contours). By preference, define these in the datasets section instead. |
| coarse_dems | no | list of str | - | dem | The location of a local copy of a DEM to use where there is no LiDAR data. By preference, define these in the datasets section instead. |
| rivers | no | list of dicts | - | dem | Each dict contains an extents key and an elevations key, giving the locations of local vector data defining the river extents (extents) and the bathymetry point elevations (elevations). |
| waterways | no | list of dicts | - | dem | Each dict contains an extents key and an elevations key, giving the locations of local vector data defining the waterway extents (extents) and the bathymetry point elevations (elevations). |
| measured_sections | yes | str | - | measured | The path to a geometry file of z-polylines defining the measured long-sections across the river. |
| riverbanks | yes | str | - | measured | The path to a geometry file of two polylines defining each bank of the river. |
| thalweg | no | str | - | measured | The path to a polyline defining the river thalweg. |
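
The path rule described above (every entry other than local_cache is either absolute or relative to the local_cache) can be sketched as follows. This is a minimal illustration and not GeoFabrics' own code: the helper name `resolve_data_path` and all example paths are hypothetical, and GeoFabrics may apply further rules (for example how subfolder is combined with file names) that this sketch does not model.

```python
from pathlib import PurePosixPath


def resolve_data_path(local_cache: str, value: str) -> str:
    """Return value unchanged if absolute, else join it onto local_cache."""
    path = PurePosixPath(value)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(local_cache) / path)


# Hypothetical data_paths entry: keys come from the table above,
# the path values are made up for illustration.
data_paths = {
    "local_cache": "/data/geofabrics_cache",
    "subfolder": "results",
    "raw_dem": "raw_dem.nc",
    "land": "/data/vectors/coastline.geojson",
}

# Resolve every entry except local_cache itself.
resolved = {
    key: resolve_data_path(data_paths["local_cache"], value)
    for key, value in data_paths.items()
    if key != "local_cache"
}

for key, value in resolved.items():
    print(f"{key}: {value}")
```

Relative entries such as subfolder end up under the cache folder, while the absolute land path is left untouched.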

Datasets

The datasets keyword and its contents are technically optional. It specifies LiDAR, raster and vector datasets, where a dataset includes files and CRS information. These can be located locally (at least for LiDAR) or remotely (accessed through an API). For local access, its contents specify the location of the data; for remote, API-accessed data, its contents specify the data services the data is to be read from. Remote data is cached locally in the specified downloads folder, which defaults to being within the local_cache. Accepted keywords are:

  1. lidar:
    • open_topography - Use this keyword if LiDAR data is to be pulled from OpenTopography. This must be followed by a dictionary containing a single LiDAR dataset to download as a key, e.g. "open_topography": { "NZ18_Banks": true }. (Support for multiple datasets will be added in time; please create an issue if you need this now.) If the .LAZ files for a dataset do not contain full datum information (e.g. Wellington_2013), you can specify the .LAZ CRS information using "open_topography": { "Wellington_2013": { "crs": { "horizontal": 2193, "vertical": 7839 } } }.
    • local - Use this keyword if you have a locally stored LiDAR dataset (it must have a tile index file). Either specify the dataset name and the dataset_folder containing the LiDAR files and tile index file (which must be named {dataset_name}_TileIndex.zip), or use the file_paths keyword to specify the LiDAR files and the tile_index_file keyword to specify the complete path to the tile index file.
  2. vector: linz or lris or stats_nz - Specify these if you want vector data to be downloaded from the corresponding data service (e.g. LINZ or LRIS). They all require the same keywords.
    • key - This is mandatory for each data service and should contain YOUR_API_KEY for that data service as a string.
    • land or bathymetry_contours [Both optional] - These are the accepted vector values. Both are optional, although there is no point specifying your API key if you are not going to specify one of these layers to download. In either case you should then specify a layer and optionally the geometry_name of that layer. See geoapis: Basic Usage for more details on the geometry_name. The layer is the unique identifier in the URL when you view the dataset on the data service (e.g. 51153 for a land vector on the LINZ LDS, or 50448 for bathymetry_contours).
  3. raster: linz or lris or stats_nz - Specify the raster layer to download.
    • key - This is mandatory for each data service and should contain YOUR_API_KEY for that data service as a string. For a raster it must be a manual key.
    • coarse_dems [Optional] - This is the accepted raster value. You should specify a layer. See geoapis: Basic Usage for more details. The layer is the unique identifier in the URL when you view the dataset on the data service (e.g. 51768 for a coarse_dems layer).
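
As a rough illustration of how the lidar entry nests, the sketch below builds the OpenTopography form quoted in the list above as a Python dictionary and serialises it to JSON. It covers only that one LiDAR form; the variable names are illustrative, and placing the result under a top-level "datasets" key is assumed from the description at the start of this page.

```python
import json

# LiDAR entry copied from the OpenTopography form quoted in the list above.
datasets = {
    "lidar": {
        "open_topography": {
            # This dataset's .LAZ files lack full datum information,
            # so the horizontal and vertical CRS are given explicitly.
            "Wellington_2013": {
                "crs": {"horizontal": 2193, "vertical": 7839}
            }
        }
    }
}

# Serialise as it might appear inside an instruction file
# (the enclosing "datasets" key is an assumption of this sketch).
text = json.dumps({"datasets": datasets}, indent=2)
print(text)
```

Only one dataset is listed under open_topography, matching the current single-dataset limit noted above.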

Datasets Mapping

The dataset_mapping key defines the order of precedence of the LiDAR datasets and the value recorded for each in the lidar_source layer. It is required if you have multiple LiDAR datasets. If you have only one LiDAR dataset and the mapping is not included, it will be auto-populated. An example is shown below. Note that every LiDAR dataset must be given a mapping value and these values must be unique.

"dataset_mapping": {
   "lidar": {
       "dataset_name_1": 1,
       "dataset_name_2": 2
   }
}
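
The two constraints stated above (every LiDAR dataset has a mapping value, and the values are unique) can be checked before running a stage. The sketch below is one possible pre-flight check under those stated rules; `check_lidar_mapping` is a hypothetical helper and not part of GeoFabrics.

```python
def check_lidar_mapping(lidar_names, mapping):
    """Raise ValueError if a dataset is unmapped or two datasets share a value."""
    missing = [name for name in lidar_names if name not in mapping]
    if missing:
        raise ValueError(f"No mapping value for: {missing}")
    values = [mapping[name] for name in lidar_names]
    if len(set(values)) != len(values):
        raise ValueError("Mapping values must be unique")
    return True


# Same content as the example above, written as a Python dictionary.
dataset_mapping = {
    "lidar": {
        "dataset_name_1": 1,
        "dataset_name_2": 2,
    }
}

ok = check_lidar_mapping(
    ["dataset_name_1", "dataset_name_2"], dataset_mapping["lidar"]
)
print("mapping is valid:", ok)
```

A mapping that reuses a value, or omits one of the listed datasets, raises a ValueError instead of returning.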