Geostore and Parameterized Queries - resource-watch/ocean-watch-data GitHub Wiki

When you "select a coastline" on the "Coastal Profile" page, the page loads the Ocean Watch JSON with country or territory specific data. image

Resources on Ocean Watch are parameterized. This means there is a placeholder in the configuration of the resource so that when a page is loaded, the placeholder is formatted with an identifier associated with a coastal country or territory. This allows Ocean Watch to serve the same configuration of data-resources for all coastal countries and territories. The Ocean Watch parametrization feature relies on the Resource Watch geostore.

Geostore

Countries and territories and their geographical boundaries used on Ocean Watch are stored in the Resource Watch geostore as geostore objects using a GeoJSON. Geostore objects are a Resource Watch "resource" and thus each country or territory geostore object can be retrieved using alpha-numeric geostore id.

Viewing geostore objects

It can be helpful to view a geostore object using the following link:

https://api.resourcewatch.org/v2/geostore/<id>/view

Example: Mozambique

Ocean Watch geostore

Ocean Watch uses a simplified version of the GADM country borders for the geostore objects. The names, descriptive information, and geostore ids are stored in the gadm36_0 table on Carto. Only coastal countries are included in the geostore. There is also a row for the world.

SELECT * FROM "wri-rw".gadm36_0 WHERE name_0 ILIKE 'Global'

There is no geometry, but its geostore id is used to parameterize charts on the intro page.

Creating Parameterized Queries

Many resources on Ocean Watch use a SQL query to load data. When this query is parameterized, the resource on Ocean Watch loads data for the specific country or territory. To create a parameterized query, follow the guidance below.

Data are queried from underlying Ocean Watch and Resource Watch datasets stored on Carto using SQL SELECT statements. All SQL statements can and should be built and tested in Carto (on the page for a data table, flip switch from "Metadata" to "SQL"). Also note that SQL queries below are written with tabs and newlines for readability, but what gets entered into the json needs to be one long, unbroken string, with all items/words separated only by spaces.

The placeholder

When writing a parameterized query, you must use a "placeholder" where the geostore id can be formatted when loaded on Ocean Watch. The application will replace the placeholder with the geostore id of the corresponding country. Thus the queries must include:

`WHERE {{geostore_env}} = "{{geostore_id}}"`

The geostore_env refers to the column header in the gadm36_0 table. There is a column for each environment,geostore_staging or geostore_prod. geostore_id refers to the unique 32 alphanumeric code for a country in the geostore. Since these columns are in a separate table from the data your SQL statement must include one or more join statements. After joining the tables you are ready to test your query.

1. Join the gadm table

Ocean Watch datasets will not have the geostore_staging and geostore_prod columns. These columns are only stored in the gadm36_0 table. They are stored separately, so any changes or updates to the geostore only necessitate changing this table. When querying a dataset in a parameterized query, you need to link these geostore_staging and geostore_prod columns from the gadm36_0 table using a JOIN statement.

INNER JOIN gadm36_0 gadm ON data.{{iso_field}} = gadm.gid_0

In this case, the query pulls columns from the gadm table and attaches the rows of the table that same 3-letter ISO code with rows in the dataset.

2. Join the aliasing table

When the dataset does not have ISO codes, parameterization requires associating the underlying data with the appropriate ISO codes. This is accomplished by using an "aliasing" table, where each non-standard name for a given country is linked to its standard name and ISO code. For example, a dataset with an entry for South Korea referred to that country as "Korea South", which has no matches in the gadmm countries table. In this case "Korea South" is an alias for South Korea and its ISO code is "KOR". The ow_aliasing_countries table is different than that for Resource Watch (Ocean Watch uses GADM countries instead of Natural Earth).

SELECT ROUND(value) AS value FROM ocn_024a_rw0_key_biodiversity_area_protection
INNER JOIN ow_aliasing_countries AS alias ON alias.alias = data.geoname
INNER JOIN gadm36_0 gadm ON alias.iso = gadm.gid_0

When joining tables to the gadm36_0 and/or the ow_aliasing_countries table, it is important to verify that all country rows were matched. This can be done using an SQL query. Carefully inspect any rows that were not joined, and add missing items to the aliasing table if relevant.

SELECT geoareaname FROM ocn_024a_rw0_key_biodiversity_area_protection 
LEFT OUTER JOIN ow_aliasing_countries AS alias ON alias.alias = geoareaname 
WHERE alias.alias is NULL

Be careful not to add duplicate aliases to the table. This will result in duplicated rows in the query results. To check if there are duplicate aliases in the table, use the following SQL statement:

SELECT alias, COUNT(*)
FROM ow_aliasing_countries
GROUP BY alias
HAVING COUNT(*) > 1

3. Test the SQL statement

When testing SQL statement, it will not execute with WHERE {{geostore_env}} = "{{geostore_id}}". Swap out these parameters for an environment and geostore id when testing

SELECT ROUND(value) AS value FROM ocn_024a_rw0_key_biodiversity_area_protection 
INNER JOIN ow_aliasing_countries AS alias ON alias.alias = geoareaname 
INNER JOIN gadm36_0 gadm ON alias.iso = gadm.gid_0 
WHERE timeperiod = 2020 AND gadm.geostore_prod ILIKE '9bc50cce7f5b6ebd0452a3b839708ba9'

4. Test the query

Once you have written the appropriate SQL statement, you will use the statement to query the Carto API. To do this, simply add the pre-fix url https://wri-rw.carto.com/api/v2/sql?q= in front of your query. Remember, the query passed to the API will not be executable without the geostore identifier being inserted, so keep these parameters swapped for actual values.

https://wri-rw.carto.com/api/v2/sql?q=SELECT ROUND(value) AS value FROM ocn_024a_rw0_key_biodiversity_area_protection INNER JOIN ow_aliasing_countries AS alias ON alias.alias = geoareaname INNER JOIN gadm36_0 gadm ON alias.iso = gadm.gid_0 WHERE timeperiod = 2020 AND gadm.geostore_prod ILIKE '9bc50cce7f5b6ebd0452a3b839708ba9'

Copy and paste the complete query url into your browser. If the result is as expected, swap the environment and the geostore identifier back to the parameterized versions.

https://wri-rw.carto.com/api/v2/sql?q=SELECT ROUND(value) AS value FROM ocn_024a_rw0_key_biodiversity_area_protection INNER JOIN ow_aliasing_countries AS alias ON alias.alias = geoareaname INNER JOIN gadm36_0 gadm ON alias.iso = gadm.gid_0 WHERE timeperiod = 2020 AND gadm.{{geostore_env}} ILIKE '{{geostore_id}}'

Congrats! This query is ready to be used in a resource.