AtlasSubsysIngest - AtlasOfLivingAustralia/ala-datamob Wiki

Introduction

This page is not an attempt to document the ingest subsystem. It is merely a pointer to the code base, along with high-level descriptions, based on the understanding of someone who has not developed or used these tools and has merely relied on their outcomes.

The overall data provision process

This wiki page is concerned only with the steps marked with an asterisk (*) in the process below, as applied to occurrence records; the preliminary steps are included for context:

  1. the data provider and the atlas establish a new system account, creating a new data resource for each discrete source system, e.g. a collection's specimen-record management system or a species profile database ... http://collections.ala.org.au/datasets
  2. the atlas generates an SFTP upload account on the upload server
  3. *the data provider generates an export in simple-dwc CSV format
  4. *the data provider uploads the compressed export
  5. *the ingest subsystem periodically checks the SFTP server for new files
  6. *when a new file is found, it is downloaded and unpacked, and a record loading process is triggered
  7. *a log of the overall ingest process is left on the SFTP server (note: planned behaviour only, as at February 2013)
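
The asterisked steps can be illustrated with a rough sketch. It is not the actual ingest code: a local directory stands in for the SFTP server, and the function names (`write_export`, `ingest_new_files`, `demo`), the file names, the choice of columns and the sample record are all assumptions made for illustration. Only the column names themselves are real Darwin Core terms.

```python
import csv
import io
import tempfile
import zipfile
from pathlib import Path

# Real Darwin Core term names; which columns to export is an assumption.
DWC_COLUMNS = ["occurrenceID", "catalogNumber", "scientificName", "eventDate"]


def write_export(records, archive_path):
    """Steps 3-4 (provider side): write a simple-dwc CSV and compress it."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS)
    writer.writeheader()
    writer.writerows(records)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical member name; the real naming conventions are not shown here.
        archive.writestr("occurrences.csv", buffer.getvalue())


def ingest_new_files(upload_dir, work_dir, seen, load_record):
    """Steps 5-6 (ingest side): find unseen archives, unpack, load each row.

    upload_dir is a local stand-in for the SFTP server; seen is a set of
    archive names already processed; load_record is called once per CSV row.
    Returns the number of rows loaded in this pass.
    """
    loaded = 0
    for archive_path in sorted(Path(upload_dir).glob("*.zip")):
        if archive_path.name in seen:
            continue  # already handled on an earlier check
        target = Path(work_dir) / archive_path.stem
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(target)
        for csv_path in sorted(target.glob("*.csv")):
            with open(csv_path, newline="", encoding="utf-8") as handle:
                for row in csv.DictReader(handle):
                    load_record(row)
                    loaded += 1
        seen.add(archive_path.name)
    return loaded


def demo():
    """Run both halves against temporary directories; return what was loaded."""
    # Made-up sample record, for illustration only.
    records = [{"occurrenceID": "urn:example:1", "catalogNumber": "A-1",
                "scientificName": "Macropus rufus", "eventDate": "2013-02-01"}]
    rows, seen = [], set()
    with tempfile.TemporaryDirectory() as tmp:
        upload_dir, work_dir = Path(tmp) / "upload", Path(tmp) / "work"
        upload_dir.mkdir()
        work_dir.mkdir()
        write_export(records, upload_dir / "export.zip")
        first = ingest_new_files(upload_dir, work_dir, seen, rows.append)
        # A second check finds nothing new, mirroring the periodic poll.
        second = ingest_new_files(upload_dir, work_dir, seen, rows.append)
    return first, second, rows
```

Calling `demo()` writes one compressed export, then runs two checks of the stand-in upload directory; only the first check loads the record, because the archive name is remembered in `seen`. A real deployment would presumably replace the directory listing with an SFTP client and `load_record` with the atlas record loader.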

Requirements for data files presented to ingest

These requirements are documented at: http://goo.gl/qzioQ or as ‘Automated ingest: file naming conventions’ under Google docs ➢ Communications ➢ Data management ➢ Mobilisation - public