FcigDM_AM - AtlasOfLivingAustralia/ala-datamob GitHub Wiki

Australian Museum: primary collection management system darwincore export

Introduction

This page documents the darwincore export implemented for the Australian Museum, one of the FCIG-OZCAM participants.

Artefacts and synopsis

| Item | Short URL | Details (or long URL) |
|---|---|---|
| This wiki page | http://goo.gl/w1Z9m | FcigDM_AM |
| Source data system | | |
| Collection management software | | KE EMu on Linux |
| Database backend | | texpress |
| Exporter's execution environment | | bash (linux command shell) |
| Adhoc query | | texql |
| Bulk export method | | texkfdump, texexport |
| Schema reporting | | texdescribe |
| DwC mapping | | awk script |
| Compression, transmission | | gzip, sftp to upload.ala.org.au |
| Output data | | Darwincore csv (simple-dwc) format with non-standard FCIG extensions |

institutionCode "AM"
dcterms:type "PhysicalObject"
basisOfRecord "PreservedSpecimen"

Data availability:

| Item | Short URL | Details (or long URL) |
|---|---|---|
| AM data before export | | http://australianmuseum.net.au/Australian-Museum-Collection-Search |
| AM data at export | | not generally available - contact data manager |
| AM data after atlas (biocache) ingest | http://goo.gl/Gkiba | |
| Completeness model | http://goo.gl/dB4W6 | Google docs -> Data management -> CompletenessDwC -> am-emu.dwccm.26 |
| Source code | http://goo.gl/kI0d8 | https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am |
| Usage doco | http://goo.gl/Aw6Zy | https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am/am%20cms%20doco.20130103.pdf |
| Final status report | http://goo.gl/8E2b0 | Google docs ➢ Communications ➢ Data management ➢ Mobilisation reports ➢ finalreport.am.odt |
| | http://goo.gl/tEioQ | finalreport.am.pdf (under the same directory) |

Behavioural diagrams

From usage documentation https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am/am%20cms%20doco.20130103.pdf...

There are five parts to the exporter:

dwcdm2.sh

The first export component is a bash shell script, /amexport/dwcdm2.sh, which is the entry point for running an export. The script prepares the export directory, reads disciplines-list, and calls the sub-script dwcdm2dsx.sh for each non-comment line (no leading #). On completion it bundles the export and sends it to the specified servers using sftp.
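The dispatch loop described above might look roughly like the following. This is a minimal sketch, not the actual contents of dwcdm2.sh: the function names `export_disciplines` and `run_sub_export` are hypothetical, and `run_sub_export` only prints a message where the real script would call dwcdm2dsx.sh.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the call to dwcdm2dsx.sh; it only reports
# which discipline would be exported.
run_sub_export() {
    echo "exporting: $1"
}

# Call run_sub_export once per discipline line in the given list file,
# skipping blank lines and lines with a leading '#'.
export_disciplines() {
    local list_file="$1" line
    # The '|| [ -n "$line" ]' part keeps a final line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) continue ;;
        esac
        run_sub_export "$line"
    done < "$list_file"
}
```

Calling `export_disciplines /amexport/disciplines-list` would then visit each active discipline in file order. Preparing the export directory, bundling, and the sftp transfer are left out of this sketch.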

The second export component is the text file /amexport/disciplines-list, which controls the behaviour of the main script dwcdm2.sh. Enter the disciplines to export here, one per line; each entry must match a value of the SecDepartment field. Comment lines (beginning with #) and blank lines are ignored, so commenting out disciplines produces a partial export. If you delete or rename this file, dwcdm2.sh rebuilds it from the database. Note: the rebuild is a costly operation (roughly 2 hours), and no export follows it in that run, so that any unwanted disciplines can first be excluded by deleting or commenting out their lines.
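The rebuild-then-stop behaviour could be sketched as below. This is an illustration under stated assumptions rather than the real script: `ensure_disciplines_list` and `query_disciplines` are hypothetical names, and `query_disciplines` returns two placeholder values instead of querying SecDepartment in the database.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder for the (slow) database query that lists the
# distinct SecDepartment values; the two names are made up.
query_disciplines() {
    printf '%s\n' 'Discipline B' 'Discipline A' 'Discipline A'
}

# Return 0 if the list file exists and the export may proceed.
# Otherwise rebuild the file and return 1 so the caller stops,
# giving the operator a chance to review the list first.
ensure_disciplines_list() {
    local list_file="$1"
    if [ -f "$list_file" ]; then
        return 0
    fi
    query_disciplines | sort -u > "$list_file"
    echo "rebuilt $list_file; review it, then re-run the export" >&2
    return 1
}
```

A caller would typically write `ensure_disciplines_list /amexport/disciplines-list || exit 0` before starting the per-discipline loop.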


Activity diagram for https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am/dwcdm2.sh

dwcdm2dsx.sh

The third export component is a bash shell script, /amexport/dwcdm2dsx.sh, which dwcdm2.sh calls for each line in disciplines-list. It exports the full list of current IDs, as well as a partial or full data export, as appropriate. It calls the scripts ozdc_full.awk and ozdc_id.awk, which handle the mapping between an emu export and a darwincore csv: ozdc_id.awk produces the output file DISCIPLINE-dwcid.csv, while ozdc_full.awk produces DISCIPLINE-dwcdata.csv.
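The two output files per discipline can be illustrated with a small sketch. Everything except the file-name pattern is an assumption: `emu_dump` is a hypothetical stand-in for the bulk export, its pipe-delimited rows are invented sample data, and the inline awk programs merely take the place of ozdc_id.awk and ozdc_full.awk.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the EMu bulk export of one discipline.
# Emits made-up "identifier|scientific name" rows.
emu_dump() {
    printf '%s\n' '101|Macropus rufus' '102|Vombatus ursinus'
}

# Write DISCIPLINE-dwcid.csv and DISCIPLINE-dwcdata.csv into out_dir.
export_discipline() {
    local discipline="$1" out_dir="$2"
    # ID list: one identifier per current record.
    emu_dump "$discipline" | awk -F'|' '{ print $1 }' \
        > "$out_dir/$discipline-dwcid.csv"
    # Data file: the mapped record fields, comma separated.
    emu_dump "$discipline" | awk -F'|' -v OFS=',' '{ print $1, $2 }' \
        > "$out_dir/$discipline-dwcdata.csv"
}
```

For example, `export_discipline Mammalogy /tmp/export` would create Mammalogy-dwcid.csv and Mammalogy-dwcdata.csv in /tmp/export, assuming that directory exists.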


Activity diagram for https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am/dwcdm2dsx.sh

ozdc_full.awk, ozdc_id.awk

The fourth and fifth export components are the awk scripts /amexport/ozdc_full.awk and /amexport/ozdc_id.awk. These scripts handle the mapping between an emu export and a darwincore csv; dwcdm2dsx.sh calls them to convert data inline before the output csv files are written by dwcdm2.sh.
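As an illustration of this kind of mapping, the sketch below turns pipe-delimited rows into quoted csv with a darwincore header. It is not the real ozdc_full.awk: the function name `map_to_dwc`, the input layout, and the choice of the catalogNumber and scientificName columns are assumptions. The constant values for institutionCode and basisOfRecord are the ones listed for the output data above.

```shell
#!/usr/bin/env bash
# Read hypothetical "identifier|scientific name" rows on stdin and write
# a darwincore-style csv on stdout.
map_to_dwc() {
    awk -F'|' '
        # Quote a value for csv, doubling any embedded double quotes.
        function q(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }
        BEGIN { print "catalogNumber,scientificName,institutionCode,basisOfRecord" }
        { print q($1) "," q($2) "," q("AM") "," q("PreservedSpecimen") }
    '
}
```

Because the function reads stdin and writes stdout, it can sit in the middle of a pipeline, which matches the inline conversion described above.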


Activity diagram for https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am/ozdc_full.awk and https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/am/ozdc_id.awk