Important elements to validate - SIB-Colombia/sib-google-spreadsheet-validator GitHub Wiki

Important elements to validate

These elements are considered important in a Darwin Core record. Each element is recorded in a column of the sheet, with the name of the element in the first cell of each column. Therefore, the first row of the sheet contains the names of the elements to validate.
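The layout described above can be sketched as a lookup from element name to column values. This is only an illustration, assuming the sheet has already been read into a list of rows; the function name `columns_by_element` and the sample rows are hypothetical and are not part of the validator itself.

```python
def columns_by_element(rows):
    """Map each element name in the first row to the cell values below it."""
    if not rows:
        return {}
    header = [str(name).strip() for name in rows[0]]
    columns = {name: [] for name in header if name}
    for row in rows[1:]:
        for index, name in enumerate(header):
            if name:
                # Pad short rows with empty cells so every column has one value per record.
                columns[name].append(row[index] if index < len(row) else "")
    return columns


# Hypothetical sample sheet: the first row holds the element names.
rows = [
    ["occurrenceID", "basisOfRecord"],
    ["urn:catalog:INST:COL:1", "PreservedSpecimen"],
    ["urn:catalog:INST:COL:2"],
]
columns = columns_by_element(rows)
```

Each validation can then work on one column at a time, for example `columns["occurrenceID"]`.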

If the sheet contains many records, it may stay gray after the validations finish. In this case, reload the window/tab of the spreadsheet in your browser.

occurrenceID

In the absence of a persistent global unique identifier, you can create one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique. For a specimen without a bona fide global unique identifier, use for example the form "urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]". The sheet validates that the value in each cell is unique.
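As a rough sketch of the two ideas above, the following builds an identifier in the `urn:catalog` form and reports repeated values in a column. The function names and the `INST` institution code are placeholders chosen for illustration; skipping empty cells in the uniqueness check is an assumption, since empty cells are covered by a separate validation.

```python
from collections import Counter


def build_occurrence_id(institution_code, collection_code, catalog_number):
    """Join the three identifiers in the urn:catalog form described above."""
    return f"urn:catalog:{institution_code}:{collection_code}:{catalog_number}"


def duplicate_rows(values):
    """Return 0-based indexes of non-empty cells whose value occurs more than once."""
    counts = Counter(value for value in values if value != "")
    return [i for i, value in enumerate(values) if value != "" and counts[value] > 1]


# "INST" is a placeholder institution code, not a real one.
example_id = build_occurrence_id("INST", "COL", "2008.1334")
```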

basisOfRecord

The specific nature of the data record, a subtype of the dcterms:type.

modified

The most recent date-time on which the resource was changed. The value is validated against the ISO 8601:2004 standard. Example: "2007-03-01T13:00:00Z/2008-05-11T15:30:00Z"

rightsHolder

A person or organization owning or managing rights over the resource. It is validated that the cells contain only alphanumeric characters.
Example: University of California.

institutionID

An identifier for the institution having custody of the object(s) or information referred to in the record. It is validated that the cells contain only alphanumeric characters.

collectionID

An identifier for the collection or dataset from which the record was derived. For physical specimens, the recommended best practice is to use the identifier in a collections registry such as the Biodiversity Collections Index. It is validated that the cells contain only alphanumeric characters.
Example: urn:lsid:biocol.org:col:34818

collectionCode

The name, acronym, code, or initials identifying the collection or data set from which the record was derived. It is validated that the cells contain only alphanumeric characters.
Example: "COL", "ANDES", "FMB", "HPUJ".

datasetName

The name identifying the data set from which the record was derived. It is validated that the cells contain only alphanumeric characters.
Example: Muestreo de Mamíferos de la Cuenca.

catalogNumber

An identifier (preferably unique) for the record within the data set or collection. The sheet validates that the value in each cell is unique (no repeated values in the column). Example: 2008.1334

recordedBy

A list (concatenated and separated by “;”) of names of people, groups, or organizations responsible for recording the original Occurrence. The sheet validates that each cell contains only alphanumeric characters. Example: Oliver P. Pearson; Anita K. Pearson

individualCount

The number of individuals represented present at the time of the Occurrence. The sheet validates that each cell contains only integer values. Examples: 1, 25

sex

The sex of the biological individual(s) represented in the Occurrence. The value is validated against a controlled vocabulary. Example: female, hermaphrodite

samplingProtocol

The name of, reference to, or description of the method or protocol used during an event. In the sheet the cells are validated to contain text with a length of more than one character.

eventDate

The date-time or interval during which an event occurred. The value is validated against the ISO 8601:2004 standard. Example: 1963-03-08T14:07-0600

eventTime

The time or interval during which an event occurred. The value is validated against the ISO 8601:2004 standard. Example: 14:07-0600

habitat

A category or description of the habitat in which the event occurred. In the sheet the cells are validated to contain text with a length of more than one character. Example: pre-cordilleran steppe

fieldNumber

An identifier given to the event in the field. Often serves as a link between field notes and the Event. In the sheet the cells are validated to contain text with a length of more than one character.

eventRemarks

Comments or notes about the event. In the sheet the cells are validated to contain text with a length of more than one character. Example: After the recent rains, the river is nearly overflowing.

waterBody

The name of the water body in which the Location occurs. The sheet validates that each cell contains text referring to a geographic water body. Example: Indian Ocean, Baltic Sea

country

The name of the country or major administrative unit in which the location occurs. The value is validated against a controlled vocabulary.

stateProvince

The name of the next smaller administrative region than country. The value is validated against a controlled vocabulary.

county

The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the Location occurs.

minimumElevationInMeters

The lower limit of the range of elevation (altitude, usually above sea level), in meters. The sheet validates that each cell contains only an integer value between -100 and 8000. Example: 100

maximumElevationInMeters

The upper limit of the range of elevation (altitude, usually above sea level), in meters. The sheet validates that each cell contains only an integer value between -100 and 8000. Example: 200

minimumDepthInMeters

The lower limit of the range of depth below the local surface, in meters. The sheet validates that each cell contains only an integer value between -100 (above sea level) and 900 (under sea level). Example: 100

maximumDepthInMeters

The greater depth of a range of depth below the local surface, in meters. The sheet validates that each cell contains only an integer value between -100 (above sea level) and 900 (under sea level). Example: 200

decimalLatitude

The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. The sheet validates that the cell value is a decimal number between -90 and 90, written with decimal digits and a point as the decimal separator. Example: -41.0983423

decimalLongitude

The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. The sheet validates that the cell value is a decimal number between -180 and 180, written with decimal digits and a point as the decimal separator. Example: -121.1761111

coordinateUncertaintyInMeters

The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. If the uncertainty is unknown, the cell will be empty.

coordinatePrecision

A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude. Examples: 0.00001 (normal GPS limit for decimal degrees), 0.000278 (nearest second), 0.01667 (nearest minute), 1.0 (nearest degree)


Types of validations

Without content

If the cell is empty, it will be marked with error (it will be colored with the corresponding color).

Content not conform to controlled vocabulary

If the value in the cell is not equal to one of the values in the controlled vocabulary, the cell will be marked with error.
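A minimal sketch of this check, assuming an exact, case-sensitive comparison against the vocabulary. The function name and the `SEX_TERMS` list are illustrative only; the actual vocabularies used by the validator are not listed here.

```python
def rows_outside_vocabulary(values, vocabulary):
    """Return 0-based indexes of cells whose value is not in the vocabulary."""
    allowed = set(vocabulary)
    return [index for index, value in enumerate(values) if value not in allowed]


# Illustrative vocabulary only; not the validator's real list of terms.
SEX_TERMS = ["female", "male", "hermaphrodite"]
flagged = rows_outside_vocabulary(["female", "Female", "hermaphrodite", "f"], SEX_TERMS)
```

With exact matching, `Female` and `f` are flagged while `female` and `hermaphrodite` pass.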

Invalid date or in incorrect format

If the value in the cell does not conform to ISO 8601:2004 in extended form, or if the date is invalid (e.g. February 29 in a non-leap year, or day 31 in a month with only 30 days), the cell will be marked with error. See http://dotat.at/tmp/ISO_8601-2004_E.pdf
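The calendar part of this check could look roughly like the sketch below. It only covers a plain extended-form date (`YYYY-MM-DD`); date-times and intervals such as those in the eventDate and modified examples are out of scope here, and the function name is hypothetical.

```python
import re
from datetime import date

# Extended form only: four-digit year, two-digit month and day, separated by hyphens.
EXTENDED_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_valid_extended_date(text):
    """Check the format first, then let datetime.date reject impossible dates."""
    match = EXTENDED_DATE.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)
    except ValueError:
        # Raised for cases such as February 29 in a non-leap year or April 31.
        return False
    return True
```

For instance, `2008-02-29` is accepted because 2008 is a leap year, while `2007-02-29` and `2008-04-31` are rejected.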

Invalid time or in incorrect format

If the value in the cell does not conform to ISO 8601:2004 in extended form, or if the time is invalid on a 24-hour clock (e.g. 25:00 or 15:69), the cell will be marked with error. See http://dotat.at/tmp/ISO_8601-2004_E.pdf
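A hedged sketch of the time check follows. It assumes hours 00 to 23, optional seconds, and an optional zone written as `Z`, `-0600` or `-06:00` (the eventTime example uses the form without a colon). The zone offset itself is not range-checked, and the function name is hypothetical.

```python
import re

EXTENDED_TIME = re.compile(
    r"([0-9]{2}):([0-9]{2})(?::([0-9]{2}))?"  # hh:mm with optional :ss
    r"(?:Z|[+-][0-9]{2}:?[0-9]{2})?"          # optional zone designator
)


def is_valid_extended_time(text):
    """Accept hh:mm[:ss][zone] and reject out-of-range hours, minutes or seconds."""
    match = EXTENDED_TIME.fullmatch(text)
    if match is None:
        return False
    hour = int(match.group(1))
    minute = int(match.group(2))
    second = int(match.group(3) or 0)
    return hour <= 23 and minute <= 59 and second <= 59
```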

Text with anomalous characters

If the values entered in the cell have anomalous characters, the cell will be marked with error.

Different institution identifiers for the same collection

Values for institution id and collection id are related. Records of a collection must have a unique institution identifier. If a value of the collection identifier already appears in the column, the cell value for the institution identifier in the same row must be equal to the previous identifier value; otherwise the cell will be marked with error.

Codes with anomalous characters

If the values for institutionCode and collectionCode cells have anomalous characters, these will be marked with error.

Different institutionCode for an institutionID

The code for the institution must correspond to a unique institution identifier. If, in a record, the cell value for the institution code differs from the code previously recorded for the same institution identifier, the cell will be marked with error.

Different collectionCode for a collectionID

The code for the collection must correspond to a unique collection identifier. If, in a record, the cell value for the collection code differs from the code previously recorded for the same collection identifier, the cell will be marked with error.

Different catalogNumber for a collection

The values for catalogNumber and collectionID are related, and records of a collection must have a unique catalogNumber. If the value of the catalogNumber already appears in the column, the cell value for the collectionID in the same row must be equal to the previous identifier value; otherwise the cell will be marked with error.
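The cross-column checks above share one pattern: the first value seen for a given key becomes the expected value for every later row with the same key. A minimal sketch of that pattern, with a hypothetical function name and made-up identifiers, might look like this; skipping rows with an empty key is an assumption.

```python
def inconsistent_rows(keys, values):
    """Return 0-based row indexes whose value differs from the first value seen for the same key."""
    first_seen = {}
    flagged = []
    for index, (key, value) in enumerate(zip(keys, values)):
        if key == "":
            continue  # empty cells are reported by a separate validation
        expected = first_seen.setdefault(key, value)
        if value != expected:
            flagged.append(index)
    return flagged


# Made-up identifiers: collection "c1" appears with two different institution ids.
collection_ids = ["c1", "c1", "c2", "c1"]
institution_ids = ["i1", "i2", "i1", "i1"]
flagged = inconsistent_rows(collection_ids, institution_ids)
```

Here only the second row is flagged, because it pairs collection `c1` with an institution identifier different from the one first recorded for it.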

Invalid value or anomalous characters in a list separated by (;)

The cell is validated to contain a list separated by (;) that does not have anomalous characters that can cause difficulties while indexing.
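The set of "anomalous" characters is not spelled out here, so the sketch below makes an explicit assumption: an item may contain letters, digits, spaces, periods, apostrophes and hyphens, and no item may be empty. Both the function name and the allowed punctuation are illustrative.

```python
# Assumption: punctuation tolerated inside a name; everything else counts as anomalous.
ALLOWED_PUNCTUATION = set(" .'-")


def is_valid_name_list(text):
    """Check a ';'-separated list: no empty items, only letters, digits and allowed punctuation."""
    items = [item.strip() for item in text.split(";")]
    if any(item == "" for item in items):
        return False
    return all(
        char.isalnum() or char in ALLOWED_PUNCTUATION
        for item in items
        for char in item
    )
```

Under these assumptions `Oliver P. Pearson; Anita K. Pearson` passes, while a trailing `;` or a character such as `|` fails.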

Not a positive integer

If the cell value is not a positive integer, this will be marked with error.

Not an integer or not in the interval of valid values: Elevation

If the cell value is not an integer, or is not in the interval of valid values for elevation (-100 to 8000), the cell will be marked with error.

Not an integer or not in the interval of valid values: Depth

If the cell value is not an integer, or is not in the interval of valid values for depth (-100 to 900), the cell will be marked with error.
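The elevation and depth checks can be sketched with one helper and the two intervals given in the element descriptions (-100 to 8000 for elevation, -100 to 900 for depth). The names `RANGES` and `is_integer_in_range` are hypothetical, and treating the limits as inclusive is an assumption.

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")

# Intervals taken from the element descriptions; inclusive limits are an assumption.
RANGES = {"elevation": (-100, 8000), "depth": (-100, 900)}


def is_integer_in_range(text, low, high):
    """Accept only plain integers (no decimal part) that fall inside [low, high]."""
    text = text.strip()
    if INTEGER.fullmatch(text) is None:
        return False
    return low <= int(text) <= high
```

For example, `is_integer_in_range("901", *RANGES["depth"])` is false, while the same value is valid for elevation.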

Does not correspond to a decimal latitude

If the cell value is not a decimal number between -90 and 90 with 6 digits after the decimal point, the cell will be marked with error.

Does not correspond to a decimal longitude

If the cell value is not a decimal number between -180 and 180 with 6 digits after the decimal point, the cell will be marked with error.
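A rough sketch of the latitude and longitude checks is shown below. It requires a point as the decimal separator and at least one decimal digit, but it does not enforce a specific number of decimal digits, which is a simplification. The function name is hypothetical.

```python
import re

# Digits, a point, and at least one decimal digit; a comma separator is rejected.
DECIMAL = re.compile(r"[+-]?[0-9]+\.[0-9]+")


def is_decimal_coordinate(text, limit):
    """Check that text is a decimal number whose absolute value is at most limit."""
    text = text.strip()
    if DECIMAL.fullmatch(text) is None:
        return False
    return abs(float(text)) <= limit
```

Use `limit=90` for decimalLatitude and `limit=180` for decimalLongitude.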

Invalid value for uncertainty

The value for uncertainty can be empty if it is unknown, cannot be estimated, or is not applicable. “30” is a valid value (reasonable lower limit of a GPS reading under good conditions if the actual precision was not recorded at the time), but “0” is not a valid value. If the cell value does not range between 1 and 8000, the cell will be marked with error.
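The rule above can be sketched as follows, assuming that any number between 1 and 8000 (inclusive) is accepted and that an empty cell is allowed. Whether decimals are permitted is not stated, so accepting them here is an assumption, and the function name is hypothetical.

```python
def is_valid_uncertainty(text):
    """Empty is allowed; otherwise the value must be a number between 1 and 8000."""
    text = text.strip()
    if text == "":
        return True
    try:
        value = float(text)
    except ValueError:
        return False
    return 1 <= value <= 8000
```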

Invalid value for coordinate precision

If the cell value is not a number with 0-6 digits after the decimal point, the cell will be marked with error.
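As an illustration, the digit-count rule could be expressed with a regular expression. The sketch assumes a non-negative number with a point as the decimal separator; the function name is hypothetical.

```python
import re

# An integer part, optionally followed by a point and one to six decimal digits.
PRECISION = re.compile(r"[0-9]+(?:\.[0-9]{1,6})?")


def is_valid_coordinate_precision(text):
    """Accept numbers with at most six digits after the decimal point."""
    return PRECISION.fullmatch(text.strip()) is not None
```

The examples given for coordinatePrecision (0.00001, 0.000278, 0.01667, 1.0) all pass, while a value with seven decimal digits does not.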

Does not correspond to scientific name format

If the cell value is a text that does not correspond to a scientific name (first letter in uppercase, no more than two white spaces between genus and species, nor between species and author(s), nor at the end of the field), the cell will be marked with error.

scientificNameAuthorship with anomalous characters

If the value entered in the cell has anomalous characters, the cell will be marked with error.

Geographical location not according to entered coordinates

If the values for country, stateProvince or county do not correspond to the geographical location retrieved for the decimal longitude and decimal latitude, the cell will be marked with error.

Select language for validations (English/Español)

In the menu, select the required language. By default, the script runs the validations in English. Once you have selected the language, reload the window/tab of the spreadsheet to see the options in the language of your choice.

If the sheet contains many records, it may stay gray after the validations finish. In this case, reload the window/tab of the spreadsheet in your browser.