FcigDM_SAM - AtlasOfLivingAustralia/ala-datamob GitHub Wiki
This is an implementation of a darwincore export, for one of the FCIG-OZCAM participants.
Item | Short URL | Details (or long URL)
This wiki page | http://goo.gl/paUZ9 | FcigDM_SAM
Source data system | |
Output data | | Darwincore csv (simple-dwc) format with non-standard FCIG extensions. Data availability:
Completeness model | http://goo.gl/Qm8lT | Google docs -> Data management -> CompletenessDwC -> sam.dwccm-26
Source code | http://goo.gl/uOhTK | https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam
Usage doco | http://goo.gl/9Ugfh | https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam/sam%20cms%20doco.20130103.pdf
Final status report | | Google docs ➢ Communications ➢ Data management ➢ Mobilisation reports ➢ finalreport.sam.odt; finalreport.sam.pdf (under the same directory)
From usage documentation https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam/sam%20cms%20doco.20130103.pdf...
There are five parts to the exporter:
- the main script (dwc_spc.sh),
- the disciplines-list (disciplines-list),
- the sub-script (dwc_spsub.sh),
- the data mapping scripts (ozdc_full.awk, ozdc_id.awk), and
- the update script (samupdate.sh).
The first export component is a bash shell script, dwc_spc.sh, which is the entry point for running an export. It prepares the export directory, reads disciplines-list, calls the sub-script dwc_spsub.sh for each non-comment line (no leading #), bundles the export on completion, and sends the bundle to the specified servers using sftp.
The second export component is the text file disciplines-list, which controls the behaviour of the main script dwc_spc.sh. Enter disciplines matching the CatCollectionName field here, one per line. Comment lines (beginning with #) and blank lines are ignored, so commenting out disciplines gives a partial export. If you delete or rename this file, dwc_spc.sh rebuilds it from the database. Note: this is a costly operation (roughly 2 hours), and no export follows the rebuild, which leaves time to exclude unwanted disciplines by deleting or commenting out their lines.
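The filtering rules for disciplines-list can be illustrated with a short bash sketch. This is not the code from dwc_spc.sh: the function names `list_disciplines` and `for_each_discipline` are hypothetical, and treating whitespace-only lines as blank is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not taken from dwc_spc.sh.

# Print each active discipline from the list file, one per line.
# Lines beginning with '#' and blank (or whitespace-only) lines are skipped.
list_disciplines() {
    local list_file="$1" line
    # The '|| [ -n "$line" ]' part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            '#'*) continue ;;
        esac
        [ -z "${line//[[:space:]]/}" ] && continue
        printf '%s\n' "$line"
    done < "$list_file"
}

# Run the given command once per active discipline,
# passing the discipline name as the last argument.
for_each_discipline() {
    local list_file="$1" discipline
    shift
    list_disciplines "$list_file" | while IFS= read -r discipline; do
        # Redirect stdin so the command cannot consume the discipline list.
        "$@" "$discipline" < /dev/null
    done
}
```

For example, `for_each_discipline disciplines-list echo exporting` prints one line per discipline that is not commented out. How the real sub-script receives its discipline argument is not stated in the usage documentation, so the calling convention here is only illustrative.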
Activity diagram for https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam/dwc_spc.sh
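The final bundle-and-send step of the main script might look roughly like the sketch below. The archive format (gzip-compressed tar), the function names, the paths, and the host are all assumptions made for illustration; the real script may bundle and transfer differently. The only external behaviour relied on is that OpenSSH `sftp` accepts a batch file through `-b`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the bundle-and-send step; not taken from dwc_spc.sh.

# Pack every file in the export directory into one gzip-compressed tarball.
# The archive must live outside export_dir so it does not include itself.
bundle_export() {
    local export_dir="$1" archive="$2"
    tar -czf "$archive" -C "$export_dir" .
}

# Write an sftp batch file that uploads the archive and then disconnects.
# Assumes the archive path contains no whitespace.
write_sftp_batch() {
    local archive="$1" batch_file="$2"
    printf 'put %s\nbye\n' "$archive" > "$batch_file"
}

# Example (host and paths are placeholders):
#   bundle_export ./export /tmp/sam-export.tar.gz
#   write_sftp_batch /tmp/sam-export.tar.gz /tmp/sam-upload.batch
#   sftp -b /tmp/sam-upload.batch user@upload.example.org
```

Batch mode makes the transfer non-interactive, which suits a scheduled export, but it requires key-based authentication because sftp will not prompt for a password in that mode.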
The third export component is the sub-script, dwc_spsub.sh, which dwc_spc.sh calls for each line in the disciplines-list. It handles exporting the full list of current ids, as well as either a partial or a full export of the data. The scripts ozdc_full.awk and ozdc_id.awk are called by dwc_spsub.sh and handle the mapping between an emu export and a darwincore csv: ozdc_id.awk converts the output file DISCIPLINE-dwcid.csv, while ozdc_full.awk converts DISCIPLINE-dwcdata.csv.
Activity diagram for https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam/dwc_spsub.sh
The fourth export component is the pair of awk scripts, ozdc_full.awk and ozdc_id.awk. These scripts handle the mapping between an emu export and a darwincore csv; dwc_spsub.sh calls them to convert data inline before the output csv files are written by dwc_spc.sh.
Activity diagram for https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam/ozdc_full.awk and https://github.com/AtlasOfLivingAustralia/ala-datamob/tree/master/biodomains/fcig-ozcam/sam/ozdc_id.awk
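To give a feel for this kind of mapping, here is a minimal sketch of an awk filter wrapped in a shell function. It is not the logic of ozdc_full.awk or ozdc_id.awk. It assumes a tab-separated export with a header row, and the source column names `CatNumber` and `SciName`, the function name `map_to_dwc_csv`, and the pairing of each source column with a Darwin Core term are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not taken from ozdc_full.awk or ozdc_id.awk.

# Read a tab-separated export (with header row) on stdin and write a
# quoted CSV on stdout, keeping only mapped columns and renaming them.
map_to_dwc_csv() {
    awk -F '\t' '
    BEGIN {
        # Assumed source-column to Darwin Core term mapping (illustrative).
        map["CatCollectionName"] = "collectionCode"
        map["CatNumber"]         = "catalogNumber"
        map["SciName"]           = "scientificName"
    }
    # Quote a value for CSV: double any embedded quotes, then wrap in quotes.
    function csv(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }
    NR == 1 {
        # Remember the positions of mapped columns and emit the new header.
        for (i = 1; i <= NF; i++)
            if ($i in map) { keep[++n] = i; hdr[n] = map[$i] }
        for (j = 1; j <= n; j++)
            printf "%s%s", csv(hdr[j]), (j < n ? "," : "\n")
        next
    }
    {
        # Emit only the mapped columns of each data row, in header order.
        for (j = 1; j <= n; j++)
            printf "%s%s", csv($keep[j]), (j < n ? "," : "\n")
    }'
}
```

Used as a filter, for example `map_to_dwc_csv < export.tsv > DISCIPLINE-dwcdata.csv`, it converts rows as they stream through, so nothing has to be held in memory beyond the header mapping.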