# Mobilization Steps

## Index to Documentation

### VertNet Portal
- Portal Documentation: https://github.com/VertNet/webapp/wiki/VertNet-Portal
### Migrating
- Migrator Workflow: https://github.com/VertNet/toolkit/wiki/Migrator-Workflow
- Migration History: https://vertnet.cartodb.com/tables/resource_staging (migrator column)
### Publishing
- IPT Manual: https://github.com/gbif/ipt/wiki/IPT2ManualNotes.wiki
- VertNet-hosted data sets (VertNet IPT): http://ipt.vertnet.org:8080/ipt/
- VertNet Custom Data Sets: http://ipt.vertnet.org:8080/iptstrays/
- Publishing History: https://github.com/VertNet/tasks/issues
### Harvesting
- Harvest Workflow: https://github.com/VertNet/gulo/wiki/Harvest-Workflow
- Harvest History: https://vertnet.cartodb.com/tables/resource_staging (harvestfolder column)
- Harvest Queue: https://github.com/VertNet/tasks/issues?q=is%3Aissue+is%3Aopen+label%3Aharvest
### Post-Harvest Processing
- Post-Harvest Workflow: https://github.com/VertNet/post-harvest-processor/wiki/Post-Harvest-Processing-Workflow
- Post-Harvest Processing History: https://console.cloud.google.com/storage/browser/vertnet-harvesting/processed/?project=vertnet-portal
- Post-Harvesting BigQuery Snapshot: https://bigquery.cloud.google.com/table/vertnet-portal:dumps.vertnet_latest
- Post-Harvest Taxon Snapshot creation: https://github.com/VertNet/post-harvest-processor/wiki/Making-Snapshots
### Indexing
- Index Workflow: https://github.com/VertNet/dwc-indexer/wiki/Index-Workflow
- Index List: https://github.com/VertNet/dwc-indexer/wiki/Index-List
- Harvest History: https://vertnet.cartodb.com/tables/resource_staging (harvestfolder column)
## Workflow Steps
The data mobilization workflow for VertNet involves the following steps:
- Pre-publication data preparation with the VertNet toolkit ("migrator"): https://github.com/VertNet/toolkit/wiki
- Publish to IPT: http://ipt.vertnet.org:8080/ipt/
- Update data set metadata in the VertNet Carto resource_staging table. When complete, use the SQL window to execute `DELETE FROM resource`, then the following statement (see the Carto SQL API sketch after this list):

  ```sql
  INSERT INTO resource
    (cartodb_id, title, url, created_at, updated_at, the_geom, eml, dwca, pubdate, orgname,
     description, emlrights, contact, email, icode, ipt, count, citation, networks,
     collectioncount, orgcountry, orgstateprovince, orgcity, source_url, migrator, license,
     lastindexed, gbifdatasetid, gbifpublisherid, doi)
  SELECT
    cartodb_id, title, url, created_at, updated_at, the_geom, eml, dwca, pubdate, orgname,
    description, emlrights, contact, email, icode, ipt, count::integer, citation, networks,
    collectioncount, orgcountry, orgstateprovince, orgcity, source_url, migrator, license,
    lastindexed, gbifdatasetid, gbifpublisherid, doi
  FROM resource_staging
  WHERE ipt=True AND networks LIKE '%Vert%'
  ```
- Harvest to Google Cloud Storage with gulo: https://github.com/VertNet/gulo/wiki/Harvest-Workflow
- Update the harvestfolder field in the VertNet CartoDB table 'resource_staging' for each newly harvested data set
- Export resource_staging.csv from CartoDB (see the harvest-folder and CSV export sketch after this list)
- Run post-harvest processor check_harvest_folders.py for data sets in resource_staging.csv
- Run post-harvest processor harvest_resource_processor.py to process data sets in resource_staging.csv
- Check the Google Cloud Storage directory tree vertnet-harvesting/processed for duplicates and counts (see the bucket listing sketch after this list)
- Use the dwc-indexer to remove from the index any data sets that have had changes to their identifier scheme
- Index any data sets that need to be updated: https://github.com/VertNet/dwc-indexer/wiki/Index-Workflow
- Load files from the processed folders into BigQuery for the data sets of interest, specified by GCS directory, using the post-harvest processor script bigquery-loader.py (see the BigQuery load sketch after this list)
- Create Taxon Subset snapshots: https://github.com/VertNet/post-harvest-processor/wiki/Making-Snapshots (see the snapshot sketch after this list)
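
The sketches below expand on a few of the steps above; they are not part of the toolkit or the documented workflow, only illustrations of what each step does. First, the resource table rebuild can be submitted to the Carto SQL API instead of the web SQL window. This is a minimal sketch assuming the vertnet account endpoint, an API key in a CARTO_API_KEY environment variable, and the requests library; adjust all three for the actual setup.

```python
# Minimal sketch: rebuild the resource table from resource_staging through the
# Carto SQL API rather than the web SQL window. The endpoint, the CARTO_API_KEY
# environment variable, and sending the statements separately are assumptions.
import os
import requests

SQL_API = "https://vertnet.cartodb.com/api/v2/sql"  # assumed endpoint for the vertnet account
API_KEY = os.environ["CARTO_API_KEY"]

DELETE_SQL = "DELETE FROM resource"

INSERT_SQL = """
INSERT INTO resource
  (cartodb_id, title, url, created_at, updated_at, the_geom, eml, dwca, pubdate, orgname,
   description, emlrights, contact, email, icode, ipt, count, citation, networks,
   collectioncount, orgcountry, orgstateprovince, orgcity, source_url, migrator, license,
   lastindexed, gbifdatasetid, gbifpublisherid, doi)
SELECT
  cartodb_id, title, url, created_at, updated_at, the_geom, eml, dwca, pubdate, orgname,
  description, emlrights, contact, email, icode, ipt, count::integer, citation, networks,
  collectioncount, orgcountry, orgstateprovince, orgcity, source_url, migrator, license,
  lastindexed, gbifdatasetid, gbifpublisherid, doi
FROM resource_staging
WHERE ipt=True AND networks LIKE '%Vert%'
"""

for statement in (DELETE_SQL, INSERT_SQL):
    response = requests.post(SQL_API, data={"q": statement, "api_key": API_KEY})
    response.raise_for_status()
    print(response.json())
```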
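The harvest-folder update and the resource_staging.csv export can go through the same SQL API. Another sketch, assuming the same endpoint and API key; the harvestfolder value, the WHERE clause values, and the csv format parameter are placeholders rather than the documented procedure.

```python
# Sketch: record the harvest folder for one resource, then export the staging
# table as CSV for the post-harvest processor scripts. The endpoint, API key
# handling, and the example values in the UPDATE are assumptions.
import os
import requests

SQL_API = "https://vertnet.cartodb.com/api/v2/sql"
API_KEY = os.environ["CARTO_API_KEY"]

# 1) Point the staging row at the folder gulo wrote to (placeholder values).
update_sql = """
UPDATE resource_staging
SET harvestfolder = 'vertnet-harvesting/data/2024-01-01/example_resource'
WHERE icode = 'EXAMPLE' AND url = 'http://ipt.vertnet.org:8080/ipt/resource?r=example'
"""
requests.post(SQL_API, data={"q": update_sql, "api_key": API_KEY}).raise_for_status()

# 2) Pull the whole staging table down as resource_staging.csv.
export = requests.get(
    SQL_API,
    params={"q": "SELECT * FROM resource_staging", "format": "csv", "api_key": API_KEY},
)
export.raise_for_status()
with open("resource_staging.csv", "wb") as handle:
    handle.write(export.content)
```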
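For the check of vertnet-harvesting/processed, listing the bucket with the google-cloud-storage client is an alternative to browsing the console. A sketch; it assumes files sit in per-data-set folders directly under processed/ and treats a file name that appears in more than one folder as the kind of duplicate worth flagging.

```python
# Sketch: report per-folder file counts under processed/ and flag file names
# that show up in more than one folder. Assumes application default credentials
# with read access and a processed/<data set folder>/<files> layout.
from collections import Counter, defaultdict

from google.cloud import storage

BUCKET = "vertnet-harvesting"
PREFIX = "processed/"

client = storage.Client(project="vertnet-portal")  # project name taken from the links above
counts = Counter()
folders_by_name = defaultdict(set)

for blob in client.list_blobs(BUCKET, prefix=PREFIX):
    relative = blob.name[len(PREFIX):]
    if "/" not in relative:
        continue  # skip placeholder objects at the top level
    folder, filename = relative.split("/", 1)
    counts[folder] += 1
    folders_by_name[filename].add(folder)

for folder, count in sorted(counts.items()):
    print(f"{folder}: {count} files")

for filename, folders in sorted(folders_by_name.items()):
    if len(folders) > 1:
        print(f"possible duplicate: {filename} appears in {sorted(folders)}")
```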
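The BigQuery loads are handled by bigquery-loader.py; the sketch below only illustrates the underlying load-from-GCS call with the google-cloud-bigquery client. The destination table comes from the snapshot link above, but the wildcard URI, the tab-delimited CSV format, and schema autodetection are assumptions about the processed files.

```python
# Sketch: load one processed folder from GCS into the BigQuery dump table.
# The source URI is a placeholder, and the format/schema settings are guesses
# about the processed files; the table name comes from the snapshot link above.
from google.cloud import bigquery

client = bigquery.Client(project="vertnet-portal")
table_id = "vertnet-portal.dumps.vertnet_latest"
source_uri = "gs://vertnet-harvesting/processed/example_resource/*"  # placeholder folder

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # adjust if the processed files are not CSV
    field_delimiter="\t",                     # assumption: tab-delimited processing output
    skip_leading_rows=1,
    autodetect=True,                          # a real load may supply an explicit schema
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(source_uri, table_id, job_config=job_config)
load_job.result()  # block until the load job finishes
print(f"{client.get_table(table_id).num_rows} rows now in {table_id}")
```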
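Finally, for the taxon subset snapshots, the authoritative procedure is the Making-Snapshots page linked above. The sketch below only shows the general shape with the BigQuery client: query the latest dump into a subset table, then extract that table to GCS. The filter column and value, the subset table name, and the export path are all placeholders.

```python
# Sketch: materialize a taxon subset of the latest dump, then export it to GCS.
# The taxon filter, subset table name, and export location are placeholders;
# follow the Making-Snapshots wiki page for the real procedure.
from google.cloud import bigquery

client = bigquery.Client(project="vertnet-portal")

subset_ref = bigquery.TableReference.from_string("vertnet-portal.dumps.vertnet_latest_mammals")
query_config = bigquery.QueryJobConfig(
    destination=subset_ref,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
query = """
SELECT *
FROM `vertnet-portal.dumps.vertnet_latest`
WHERE LOWER(class) = 'mammalia'  -- placeholder taxon filter
"""
client.query(query, job_config=query_config).result()

# Export the subset as sharded, gzipped CSV for distribution (placeholder path).
extract_config = bigquery.ExtractJobConfig(compression="GZIP", destination_format="CSV")
client.extract_table(
    subset_ref,
    "gs://vertnet-harvesting/snapshots/vertnet_latest_mammals_*.csv.gz",
    job_config=extract_config,
).result()
```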