Data guidelines for record level terms - inbo/vlaams-biodiversiteitsportaal GitHub Wiki

Record level terms

Term Status
type Required *
modified Do not use
language Required
license Required
rightsHolder Required
accessRights Strongly recommended
bibliographicCitation Do not use
references Share if available
institutionID Recommended *
collectionID Share if available
datasetID Required *
institutionCode Required
collectionCode Strongly recommended *
datasetName Required
ownerInstitutionCode Do not use
basisOfRecord Required
informationWithheld Strongly recommended *
dataGeneralizations Required *
georeferenceRemarks Recommended *
dynamicProperties Required *
identificationVerificationStatus Strongly recommended

* Conditional, see Details section

Details

type

Required only for occurrence datasets

Event

modified

The modification information in our databases often does not reflect all changes, such as changes in related tables or in the mapping. The last-modified information calculated by GBIF is a better metric.

language

Must be English

en

license

http://creativecommons.org/publicdomain/zero/1.0/

rightsHolder

The organization that holds the rights to the data. If there are multiple rights holders, use the organization that managed or made the decision to release those rights under CC0. This is often the same as the publishing organization. Should be a single organization.

  • INBO - for INBO datasets
  • Acronym of publishing organization - for other datasets
    • Example: BGM for Botanical Garden Meise

accessRights

For INBO datasets we use this term (cf. VertNet) to link to our norms for data use:

  • https://www.inbo.be/en/norms-data-use - for INBO datasets

bibliographicCitation

Do not use this field: it only stuffs the record with a citation that easily becomes outdated. The metadata is the place for the citation. For checklists, the only valid use of this term is to indicate the source of the taxon record (a source field is not available in the taxon core, in contrast with the distribution extension).

references

Should be a URL to that record on a public website.

Example:

http://waarnemingen.be/waarneming/view/113530380

institutionID

The recommended, widely-used registry for institutions is the Research Organization Registry (ROR).

Example:

https://ror.org/00j54wy13

collectionID

In negotiation. Share if available: for instance, the GRSciColl UUID (the UUID of the collection given by the GBIF Registry) could be used.

datasetID

Should be the full DOI URL of the dataset. The DOI is generated after the first publication of the dataset on GBIF, so it can only be included from the second publication onward.

Examples:

  • https://doi.org/10.15468/wtda1m - for the Manual of Alien Plants Belgium
  • https://doi.org/10.15468/2dboyn - for the Catalogue of the Rust Fungi of Belgium
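The "full DOI URL" requirement can be checked before publication. The following is a minimal sketch, not part of the guidelines: the function name `is_full_doi_url` and the regular expression are assumptions, and the pattern only tests the general shape of an `https://doi.org/` URL, not whether the DOI resolves.

```python
import re

# Assumed shape of a DOI URL: https://doi.org/ + "10." + registrant code + "/" + suffix.
# This pattern is illustrative; it is not defined by the guidelines.
DOI_URL_PATTERN = re.compile(r"^https://doi\.org/10\.\d{4,9}/\S+$")


def is_full_doi_url(dataset_id: str) -> bool:
    """Return True if dataset_id looks like a full DOI URL (hypothetical helper)."""
    return bool(DOI_URL_PATTERN.match(dataset_id))


if __name__ == "__main__":
    # A full DOI URL passes; a bare DOI without the resolver prefix does not.
    print(is_full_doi_url("https://doi.org/10.15468/wtda1m"))
    print(is_full_doi_url("10.15468/wtda1m"))
```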

institutionCode

Acronym of the organization that is custodian of the data. Should be the same as rightsHolder. Should be a single organization.

  • INBO - for INBO datasets
  • Acronym of publishing organization - for other datasets
    • Example: BGM for Botanical Garden Meise

collectionCode

Strongly recommended if data are kept in a recognizable data system. Use the acronym of the data system that holds the data. Avoid spaces.

Examples:

NBN
VIS
ABV
InboVeg
UvA-BiTS

datasetName

Title of the published dataset (same as the title in the metadata). Useful as a human-readable name of the originating dataset in aggregated data.

ownerInstitutionCode

Do not use this field: it is too similar in definition to rightsHolder.

basisOfRecord

Uses a fixed vocabulary:

PreservedSpecimen
FossilSpecimen
LivingSpecimen
HumanObservation
MachineObservation
MaterialSample
Occurrence
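Because the vocabulary is fixed, values can be checked mechanically. This is a minimal sketch: the names `BASIS_OF_RECORD_VALUES` and `is_valid_basis_of_record` are hypothetical, and treating the match as exact and case-sensitive is an assumption, not something the guidelines state.

```python
# The values listed above, collected in a set for membership tests.
BASIS_OF_RECORD_VALUES = frozenset({
    "PreservedSpecimen",
    "FossilSpecimen",
    "LivingSpecimen",
    "HumanObservation",
    "MachineObservation",
    "MaterialSample",
    "Occurrence",
})


def is_valid_basis_of_record(value: str) -> bool:
    """Return True if value is one of the listed terms.

    Assumption: the comparison is exact and case-sensitive.
    """
    return value in BASIS_OF_RECORD_VALUES


if __name__ == "__main__":
    print(is_valid_basis_of_record("HumanObservation"))
    print(is_valid_basis_of_record("Human observation"))
```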

informationWithheld

Strongly recommended if substantial information is withheld (measurements, location, etc.).

Examples:

  • original locations available upon request
  • see metadata

If mapped, indicate this in the second paragraph of the description.

dataGeneralizations

Required if data are generalized. As this affects use of the data, it is preferred to have a short sentence rather than referring to metadata.

Example:

Coordinates are generalized to a 5x5km UTM grid

georeferenceRemarks

Recommended if data are generalized.

Example:

Coordinates are centroid of used grid square

dynamicProperties

Required if both a low-resolution and a high-resolution version of a dataset are appropriate for the Flemish Biodiversity Portal (VBP):

  • {"rbac":false,"rbac_allowed":"HIGHRES"} for the low resolution version
  • {"rbac":true,"rbac_allowed":"HIGHRES"} for the high resolution version

If included, format as JSON. Avoid using it for measurements of organisms: use the measurements or facts extension instead.
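Serializing the value with a JSON library avoids hand-typed quoting errors. The sketch below reproduces the two strings listed above; the function name `vbp_dynamic_properties` and its `high_resolution` parameter are hypothetical, introduced only for illustration.

```python
import json


def vbp_dynamic_properties(high_resolution: bool) -> str:
    """Build the dynamicProperties JSON string for a VBP dataset version.

    Hypothetical helper: "rbac" is true for the high resolution version
    and false for the low resolution version, as in the list above.
    """
    properties = {"rbac": high_resolution, "rbac_allowed": "HIGHRES"}
    # Compact separators produce the same spacing-free form as the examples.
    return json.dumps(properties, separators=(",", ":"))


if __name__ == "__main__":
    print(vbp_dynamic_properties(False))
    print(vbp_dynamic_properties(True))
```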

identificationVerificationStatus

Strongly recommended. Use a controlled vocabulary so that occurrences can be filtered optimally in the VBP.

Examples:

  • validated
  • validated, high probability