Home - AtlasOfLivingAustralia/ala-datamob GitHub Wiki

Welcome

Welcome to the Atlas of Living Australia's data mobilisation portal.

Here you will find resources to help you share your data with everyone through the ALA, along with any tools or code that we have developed, or that others have shared with us, for this purpose.

Data mobilisation

In short, data mobilisation (DM) refers to the standardised, automated migration of data to and from the Atlas for the purposes of sharing and further data analysis. DM is distinct from the more manual, ad hoc methods of sharing data, and from sharing data by default when using an Atlas sub-system for data management.

Exporting from the data provider

Where an institution curates their data in systems as part of their own processes, they may then wish to share their data with the public through the ALA, or take advantage of some of the Atlas' systems to perform data analysis.

The organisation then hosts code that automates (or makes repeatable) the translation of these data into a standard form (e.g. DarwinCore or HISPID). This process either pushes the data to the Atlas periodically or otherwise makes them available for harvest on demand by any interested party. Data mobilisation is the term we use to describe this concept.
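The translation step described above can be illustrated with a minimal sketch. It is not code from this project: the source column names in `FIELD_MAP`, the helper names and the sample institution code are all hypothetical, and only the Darwin Core term names on the right-hand side of the mapping come from the standard. It assumes the source system can hand over records as plain dictionaries.

```python
import csv
import io

# Hypothetical source-system column names (left) mapped to Darwin Core terms (right).
FIELD_MAP = {
    "specimen_no": "catalogNumber",
    "taxon": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collected_on": "eventDate",
}


def to_darwin_core(record, institution_code):
    """Translate one source record (a dict) into a dict keyed by Darwin Core terms."""
    dwc = {
        term: record[src]
        for src, term in FIELD_MAP.items()
        if record.get(src) not in (None, "")
    }
    dwc["institutionCode"] = institution_code
    # Assumption: every record in this source describes a preserved specimen.
    dwc["basisOfRecord"] = "PreservedSpecimen"
    return dwc


def export_csv(records, institution_code):
    """Return the translated records as CSV text, ready to be pushed or harvested."""
    fieldnames = list(FIELD_MAP.values()) + ["institutionCode", "basisOfRecord"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(to_darwin_core(record, institution_code))
    return buffer.getvalue()
```

Because the mapping is held in one table, the same export can be re-run on a schedule (push) or on request (harvest) without manual steps, which is the repeatability this section is concerned with.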

Importing to the data provider

Data repatriation can be considered the return and integration of any post-analysis or value-added data into a data provider's system boundary, potentially to improve the data within their source systems. This concept covers at least two distinct activities; in both cases, it might be desirable for the source system to store or make available the efforts of its 'downstream' data consumers:

  1. value-added information associated with the data in the source system, but not specifically of the same type (class, form, schema, data standard …), and
  2. existing (changed) or new records of the same type found in the source system.
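The two activities listed above could be separated on import along the following lines. This is a sketch under stated assumptions, not an existing Atlas or provider API: it assumes every returned item carries the identifier of the source record it relates to (here the Darwin Core term `occurrenceID`), and the function names are hypothetical.

```python
from collections import defaultdict


def group_annotations(annotations, key="occurrenceID"):
    """Activity 1: group value-added items by the source record they refer to."""
    grouped = defaultdict(list)
    for note in annotations:
        grouped[note[key]].append(note)
    return dict(grouped)


def classify_returned(source, returned, key="occurrenceID"):
    """Activity 2: split returned records of the source type into new, changed and unchanged."""
    source_by_id = {rec[key]: rec for rec in source}
    result = {"new": [], "changed": [], "unchanged": []}
    for rec in returned:
        original = source_by_id.get(rec[key])
        if original is None:
            result["new"].append(rec)
        elif original != rec:
            result["changed"].append(rec)
        else:
            result["unchanged"].append(rec)
    return result
```

Keeping the two paths separate lets a provider store annotations alongside a record without altering it, while changed or new records can be reviewed before they are merged into the source system.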

Conceptual menu...

Specific implementations of DM (coupled to their source systems) for faunal collections and herbaria
Status of the partner institutions
Domain-specific data standards: DarwinCore for faunal collections and HISPID for herbaria
Domains in biodiversity - arbitrary boxing of groups (either core data-gathering or supplementary, aka derived, secondary, satellite, ... data-generating)
Relevant Atlas sub-systems, particularly the Biocache, Ingest and Quality subsystems
Classes of data and how they surface in the Atlas
Identifying and communicating relationships (derived collection items, inferred records, duplicates ...)
Spatial data and some of its considerations
A logical, more abstract DM process
DM completeness - external link (fitness for use, quality at source)
Algorithms, analyses, processes for DM

Hand over everything at once...

You can browse all content (wiki, source code, ...) through Google Code: http://code.google.com/p/ala-datamob/source/browse/#svn; the wiki alone is found here: http://code.google.com/p/ala-datamob/source/browse/#svn%2Fwiki

In addition to the above, you can take a local copy of all the content (except the DM completeness section) by using the following svn url:

Contact us...

It's early days for this wiki; in time, things might evolve from a scantily clad jumble of notes. Feel free to make comments through the wiki interface or directly. If you have any questions, please do not hesitate to contact us using the following (in order of preference):
