Stands & Metric Layers - OurPlanscape/Planscape GitHub Wiki
Stands & Metric Layers
Planscape and Forsys utilizes the RRK datasets to run it's scenarios. These layers are available in raster form and need to be loaded in the database, for us to calculate the metrics used by Planscape.
Rationale
To keep our Forsys runs short, we decided to cache everything we could cache before-hand. All of the data is pre-calculated by a series of scripts, allowing our Forsys runs to focus just on fetching the pre-calculated data.
There are a few important tables that needs to be populates:
stands_stand
(stands/models.py::Stand
)stands_standmetric
(stands/models.py::StandMetric
)conditions_basecondition
(conditions/models.py::BaseCondition
)conditions_condition
(conditions/models.py::Condition
)conditions_conditionraster
(conditions/models.py::ConditionRaster
)
The stands_stand
stores information about our planning subdivisions, the stands. Each stand is a hexagon, created by ST_HexagronGrid
function in PostGIS. The stands_standmetric
table has a foreign key to stands_stand
and represents a list of metrics for a given condition in that stand. Each row stores the min
, avg
, max
, count
and sum
for that stand.
How to load the data
We have a list of Django management commands in order to load this data. They should be run in order:
- Create Stands (this is a simple SQL function - we need to create the management command);
- Load Conditions;
- Load Rasters;
- Load Metrics v2;
All of these commands and functions takes up several parameters, allowing you to tweak how you want to run it.
How long does it take?
To load stands
the runtime is not that bad. Last measurement is less than one hour for all stand sizes. Loading raster's also takes quite a while (separate 1 day for this - as this process cannot be run in parallel).
To load all the metrics, we have the following run-times, per condition:
- LARGE: approximately 2 minutes, per condition
- MEDIUM: approximately 10 minutes, per condition
- SMALL: approximately 60-80 minutes, per condition
Create Stands
To create the stands, we use a SQL function that should be installed in your database by our migrations. The system is configured to run the creation of LARGE
stands automatically, but not for the rest of the stand sizes (MEDIUM
, SMALL
), because they take a long time to be created.
This supports the following parameters:
- Extent (this extent represents the whole state of California)
- Length of the Hexagon. These magic numbers will give us extents with specific areas
- Size of the stand. Enumeration here
- Boolean flag to reset the stands of this size, before recreating them.
Large Stands
SELECT * FROM
create_stands(
ST_MakeEnvelope(
-2356881.4306262177415192,
1242364.3072737671900541,
-1646662.6679147812537849,
2452712.7869634097442031,
5070),
877.38267558,
'LARGE',
TRUE
);
Medium Stands
SELECT * FROM
create_stands(
ST_MakeEnvelope(
-2356881.4306262177415192,
1242364.3072737671900541,
-1646662.6679147812537849,
2452712.7869634097442031,
5070),
392.377463,
'MEDIUM',
TRUE
);
Small Stands
SELECT * FROM
create_stands(
ST_MakeEnvelope(
-2356881.4306262177415192,
1242364.3072737671900541,
-1646662.6679147812537849,
2452712.7869634097442031,
5070),
124.0806483,
'SMALL',
TRUE
);
Load Conditions
Our system depends on many conditions. This is another term for metric. Conditions represent some sort of phenomenon, that can be represented by a raster layer.
Each condition has a few attributes and are accurately described in this file. The load conditions management command parses this JSON (also allows you to specify a different JSON input) and creates the corresponding data in conditions_basecondition
and conditions_condition
tables.
With these conditions in place, we can load the rasters.
Parameters:
--conditions-file
, not required, defaults to [this file (https://github.com/OurPlanscape/Planscape/blob/main/src/planscape/config/conditions.json)--dry-run
, not required, defaults to false
This command will upsert
all the conditions found in the JSON file. This is run every time the system is deployed.
Example:
python manage.py load_conditions
Load Rasters
Conditions contains metadata about our rasters. After we load our conditions, we can load the actual data for each of these. This command is not triggered automatically, because it takes a long time to load all of the data.
Parameters:
--conditions-ids
, not required, defaults to all conditions. You can specify multiple condition ids with--condition-ids 1 2 3
, where1 2 3
are the primary key for theconditions_condition
table.--clear
, not required, defaults to true. If you set this to true, the data for a specific condition will be removed before inserting the new calculate data.
Example:
python manage.py load_rasters --condition-ids 1 2 3
Load Metrics v2
Given that we have our conditions in place and all the rasters inside the database, we can start calculating stands_standmetric
values. For each stand, we try to intersect and aggregate values from a condition in the database. In this process we extract basic information, such as min
, avg
, max
, sum
and count
(of pixels).
This command processes each condition against all the intersecting stands, and inserts the corresponding records in the stands_standmetric table
.
warning: there are two versions of this command. the first version is very slow and will be deprecated.
Parameters:
--condition-ids
- list of integers, each id identifying a specific condition. Defaults to all conditions.--size
- one ofSMALL
,MEDIUM
orLARGE
, defaults to medium.--clear
, boolean determining if we should delete the data for a certain condition being processed and a stand size.--max-workers
, number of maximum workers used by the program to insert the data. defaults to 4. Increase it if you have more threads available in your system.
Example:
python manage.py load_conditions --size MEDIUM --condition-ids 1 2 3 --clear
the example above will clear the data before loading, only for the specified conditions and stand size.
Reviewing conditions that have been processed
If this process gets interrupted, you can review the conditions that have been processed with an SQL statement like the following statement. For instance, this will show the condition_ids for SMALL stands that relate to the Southern California region:
SELECT DISTINCT stands_standmetric.condition_id as condition_id
FROM stands_standmetric
JOIN stands_stand ON stands_stand.id = stands_standmetric.stand_id
JOIN conditions_condition ON stands_standmetric.condition_id = conditions_condition.id
JOIN conditions_basecondition ON conditions_condition.condition_dataset_id = conditions_basecondition.id
WHERE conditions_basecondition.region_name = 'southern-california'
AND stands_stand.size = 'SMALL';