GBIF Metadata info - inbo/vlaams-biodiversiteitsportaal GitHub Wiki

This section covers the minimum metadata fields required for a GBIF upload, from the VBP/INBO point of view. It is based on the inbo/datapublication wiki.

Metadata

Basic metadata

  • Title
    The title is a human-readable name for the dataset. Keep it short and descriptive. It always ends with the geographic scope “... (in) Flanders, Belgium”.
    For occurrence datasets, the title generally starts with the name of the originating database, followed by " - " (space, hyphen, space), so that datasets of the same origin appear together (Meetnetten.be - ..., VIS - ..., Waarnemingen.be - ...).
    Examples:
    Bat observations in the Antwerp forts, 2015-2017, Flanders, Belgium;
    Watervogels - Wintering waterbirds in Flanders, Belgium;
    Checklist of non-native freshwater fishes in Flanders, Belgium

  • Short Name
    The shortname is used to reference the dataset on the IPT and on GitHub, where it is part of the URL. It should be lowercase, hyphenated and stable, because it cannot be changed once the dataset is created. For occurrence datasets, the shortname generally starts with the name of the originating database, so that datasets of the same origin appear together (meetnetten-, vis-, inboveg-, ...). The shortname always ends with the dataset type (-occurrences or -checklist). The use of -events has been discontinued, because datasets might change format (Occurrence Core vs Event Core).
    Examples:
    watervogels-occurrences (= db + occurrences)
    vis-inland-zoetwater-occurrences (= db + subset + occurrences)
    meetnetten-amfibieen-fuiken-occurrences (= db + subset + subset + occurrences)
    alien-fishes-checklist (= topic + checklist)

  • Publishing Organization
    This is set in the IPT when registering the dataset and cannot be changed later. Please ensure you are publishing through the INBO IPT if you wish the publisher to be set as INBO.

  • Type
    This is set in the IPT when you create a dataset and cannot be changed later.
    The three types are Occurrence, Sampling Event and Checklist.
    Occurrence data typically records evidence of the occurrence of a species at a specific place (and usually date). Examples include natural history collection databases, species atlas data, ...
    Where there is enough information surrounding the sampling events, it is often better to present the data as a sampling event dataset.
    Sampling event data gives detailed information about species occurrences at specific locations and times, accompanied by a description of the sampling methods and/or protocols used to obtain the data. These are generally not opportunistic observations. Examples include monitoring data, eDNA surveys, species transects, ...
    Checklist datasets are lists of species falling under one or more categories, such as taxonomic, red list, locally invasive, national park checklists, ... If there is enough information available, it is often better to present this type of data as an occurrence dataset.

  • Subtype
    If relevant, choose the most applicable subtype from the drop-down menu.
    "Observation" is generally used for INBO occurrence datasets.
    "Inventory" is generally used for checklists, with "thematic" used for topics such as invasive species.

  • Data Language
    Preferably English, so that the dataset can be used as widely as possible.

  • Metadata Language
    Preferably English, so that the metadata can be understood as widely as possible.

  • Data License
    For INBO staff:
    Choose Public Domain CC0 1.0. This follows the INBO open data policy (opendatabeleid). Please contact the data stewards if you need to discuss this.
    For other parties wanting to publish data via GBIF in order to include it in the data available via the VBP:
    GBIF only allows data to be published with either a Public Domain CC0 1.0, an Attribution CC BY 4.0 or an Attribution-NonCommercial CC BY-NC 4.0 data use licence. If you cannot publish all of your data under these terms, please contact [email protected] for further advice and support.
    More information about data use licences and GBIF can be found here.

  • Description
    Here, you can give a brief description of the dataset. Note that this section will be automatically added to PURE under “description”.
    The description should state whether any data have been generalized and/or withheld. If so, be as detailed as possible: give the reasoning and, where possible, explain how the original data can be accessed.
    If the dataset comes from a database (INBO or otherwise), please mention this in part 1 of the description. Some template texts can be copied and pasted into your description (see the end of this section).
    If the publication is of an INBO dataset, you should include the following text regarding the INBO terms of use in the description:
    "We have released this dataset to the public domain under a Creative Commons Zero waiver. We would appreciate it if you follow the INBO norms for data use (https://www.inbo.be/en/norms-data-use) when using the data. If you have any questions regarding this dataset, don't hesitate to contact us via the contact information provided in the metadata or via [email protected]."
    You can also choose to acknowledge the project and/or funding which produced the dataset.
    Template texts for an INBOVEG dataset
    Nothing obscured:
    This dataset comes from INBOVEG, the INBO vegetation record database (also part of the "Global Index of Vegetation-plot databases" and the "European Vegetation Archive").
    The GBIF dataset is the same as the data in INBOVEG. The relevant identification code is XXXXX.
    Issues related to the dataset can be submitted here: https://github.com/inbo/ (or other link/contact point).
    Something obscured:
    This dataset comes from INBOVEG, the INBO vegetation record database (also part of the "Global Index of Vegetation-plot databases" and the "European Vegetation Archive").
    The GBIF dataset contains partially redacted locations for the following species: XXXXX. This was done to protect the locations of these sensitive species, following the INBO guidelines (link in preparation). The relevant identification code is XXX. For the precise locations, please contact XXXXX.
    Issues related to the dataset can be submitted here: https://github.com/inbo/ (or other link/contact point).

  • Maintenance: Update frequency
    If you already have a good idea of how often the dataset will be updated, choose the most appropriate option in the drop-down.
    If you aren’t sure, there is no problem leaving this on “Unknown”.

  • Maintenance Description
    If desired, you can give more details on the planned frequency or type of maintenance.
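
As an illustration, the title, shortname and licence conventions described above can be expressed as simple checks. This is a minimal Python sketch, not part of the IPT or of any INBO tooling: the function names, the licence labels used as plain strings and the assumption that digits are allowed in a shortname are all hypothetical.

```python
import re

# Assumption: a shortname consists of lowercase letters or digits,
# grouped into parts that are separated by single hyphens.
SHORTNAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
DATASET_TYPE_SUFFIXES = ("-occurrences", "-checklist")
GEOGRAPHIC_SCOPE = "Flanders, Belgium"
# Licence labels as named in the guidelines (hypothetical string form).
GBIF_LICENSES = {"CC0 1.0", "CC BY 4.0", "CC BY-NC 4.0"}


def check_shortname(shortname: str) -> list[str]:
    """Return a list of problems found in a dataset shortname."""
    problems = []
    if not SHORTNAME_PATTERN.fullmatch(shortname):
        problems.append("shortname must be lowercase and hyphenated")
    if not shortname.endswith(DATASET_TYPE_SUFFIXES):
        problems.append("shortname must end with -occurrences or -checklist")
    return problems


def check_title(title: str) -> list[str]:
    """Return a list of problems found in a dataset title."""
    problems = []
    if not title.strip().endswith(GEOGRAPHIC_SCOPE):
        problems.append("title must end with the geographic scope")
    return problems


def check_license(license_name: str, inbo_staff: bool) -> list[str]:
    """Return a list of problems with the chosen data licence."""
    if license_name not in GBIF_LICENSES:
        return ["GBIF accepts only CC0 1.0, CC BY 4.0 or CC BY-NC 4.0"]
    if inbo_staff and license_name != "CC0 1.0":
        return ["INBO datasets are expected to use CC0 1.0"]
    return []


if __name__ == "__main__":
    print(check_shortname("meetnetten-amfibieen-fuiken-occurrences"))
    print(check_shortname("Vis-Inland"))
    print(check_title("Checklist of non-native freshwater fishes in Flanders, Belgium"))
    print(check_license("CC BY 4.0", inbo_staff=True))
```

Each function returns an empty list when the value follows the convention, so the checks can be combined before a dataset is created on the IPT, which matters most for the shortname because it cannot be changed afterwards.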

Contacts

For all contacts, please give at least a first and last name, an email address and an ORCID as the personnel identifier. When giving an organisation, use the full name, e.g. Research Institute for Nature and Forest (INBO). The address, telephone number and position of contacts are unnecessary (and subject to change in any case).
If you want to acknowledge nameless contributors, for example ‘all grass identifiers’, this can be done via the acknowledgements or the description.

  • Resource Creators
    Resource creators are all the people who contributed significantly to the dataset. This includes the main researcher, collaborators, significant data collectors and the people publishing the dataset. The order of resource creators translates to the order of the authors in the dataset citation.

  • Metadata Providers
    Metadata providers include those who wrote the dataset metadata. In most cases this includes the main researcher and the people who took care of publishing the dataset.

  • Associated Parties
    Preferably leave this empty and add involved people as resource creators instead.
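
The minimum contact details requested above can also be checked mechanically. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, and the ORCID check relies on the ISO 7064 MOD 11-2 checksum that ORCID documents for its identifiers.

```python
import re

# Hypothetical key names for the minimum details requested per contact.
REQUIRED_CONTACT_FIELDS = ("first_name", "last_name", "email", "orcid")
# Four hyphen-separated groups of four characters; the last may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def is_valid_orcid(orcid: str) -> bool:
    """Check the format and the checksum character of an ORCID iD."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected


def missing_contact_fields(contact: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [
        field
        for field in REQUIRED_CONTACT_FIELDS
        if not contact.get(field, "").strip()
    ]


if __name__ == "__main__":
    contact = {
        "first_name": "Jane",
        "last_name": "Doe",
        "email": "jane.doe@example.org",
    }
    print(missing_contact_fields(contact))
    print(is_valid_orcid("0000-0002-1825-0097"))
```

The checksum only confirms that an identifier is well formed. It does not confirm that the ORCID belongs to the person named in the contact.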

Acknowledgements

Use this field to acknowledge funders and other key contributors who are not already mentioned elsewhere in the metadata.

Coverage and keywords

  • Geographic coverage

  • Taxonomic coverage

  • Temporal coverage

  • Keywords

Project data

Sampling methods

Citations

Additional metadata

Data guidelines

Data guidelines for record level terms

Data guidelines for occurrences

Data guidelines for checklists

Data guidelines for sampling events