Important elements to validate - SIB-Colombia/sib-google-spreadsheet-validator GitHub Wiki
Important elements to validate
These elements are considered important in a Darwin Core record. Each element is recorded in a column of the sheet, with the name of the element in the first cell of each column. Therefore, the first row of the sheet contains the names of the elements to validate.
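As a rough illustration of this layout, the sketch below maps each element name in the first row to its column index. The function name `header_index` and the sample rows are hypothetical and are not taken from the validator's code.

```python
def header_index(rows):
    """Map each element name in the first row to its zero-based column index."""
    if not rows:
        return {}
    return {name.strip(): col for col, name in enumerate(rows[0]) if name.strip()}


# Hypothetical sheet: the first row holds element names, later rows hold records.
sheet = [
    ["occurrenceID", "basisOfRecord", "eventDate"],
    ["urn:catalog:ICN:COL:2008.1334", "PreservedSpecimen", "1963-03-08"],
]
columns = header_index(sheet)
event_dates = [row[columns["eventDate"]] for row in sheet[1:]]
```

Looking columns up by name rather than by position means the check does not depend on the order in which the elements appear in the sheet.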
If the sheet contains a large number of records, it may stay grayed out after the validations finish. In that case, reload the window or tab of the spreadsheet in your browser.
occurrenceID
In the absence of a persistent, globally unique identifier, you can construct one from a combination of identifiers in the record that most closely makes the occurrenceID globally unique. For a specimen without a bona fide globally unique identifier, use for example the form "urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]". The sheet validates that the value in each cell is unique.
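A minimal sketch of the two ideas above: building an identifier in the urn:catalog form and finding repeated values in a column. The helper names and the sample codes ("ICN", "COL") are hypothetical; the validator's own implementation may differ.

```python
from collections import Counter


def build_occurrence_id(institution_code, collection_code, catalog_number):
    """Combine the three identifiers into the urn:catalog form."""
    return f"urn:catalog:{institution_code}:{collection_code}:{catalog_number}"


def duplicated_values(values):
    """Return the set of values that appear more than once in a column."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n > 1}


# Hypothetical column of occurrenceID values.
ids = [
    build_occurrence_id("ICN", "COL", "2008.1334"),
    build_occurrence_id("ICN", "COL", "2008.1335"),
    build_occurrence_id("ICN", "COL", "2008.1334"),  # repeated on purpose
]
repeated = duplicated_values(ids)
```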
basisOfRecord
The specific nature of the data record - a subtype of the dcterms:type
modified
The most recent date-time on which the resource was changed. Cells are validated against the ISO 8601:2004 standard. Example: "2007-03-01T13:00:00Z/2008-05-11T15:30:00Z"
rightsHolder
A person or organization owning or managing rights over the resource. It is validated that the cells contain only alphanumeric characters.
Example: University of California.
institutionID
An identifier for the institution having custody of the object(s) or information referred to in the record. It is validated that the cells contain only alphanumeric characters.
collectionID
An identifier for the collection or dataset from which the record was derived. For physical specimens, the recommended best practice is to use the identifier in a collections registry such as the Biodiversity Collections Index. It is validated that the cells contain only alphanumeric characters.
Example: urn:lsid:biocol.org:col:34818
collectionCode
The name, acronym, code, or initials identifying the collection or data set from which the record was derived. It is validated that the cells contain only alphanumeric characters.
Example: "COL", "ANDES", "FMB", "HPUJ".
datasetName
The name identifying the data set from which the record was derived. It is validated that the cells contain only alphanumeric characters.
Example: Muestreo de Mamíferos de la Cuenca.
catalogNumber
An identifier (preferably unique) for the record within the data set or collection. The sheet validates that the value in each cell is unique (no repeated values in the column). Example: 2008.1334
recordedBy
A list (concatenated and separated by “;”) of names of people, groups, or organizations responsible for recording the original Occurrence. The sheet validates that each cell contains only alphanumeric characters. Example: Oliver P. Pearson; Anita K. Pearson
individualCount
The number of individuals present at the time of the Occurrence. The sheet validates that each cell contains only integer values. Examples: 1, 25
sex
The sex of the biological individual(s) represented in the Occurrence. The cell value is validated against a controlled vocabulary. Example: female, hermaphrodite
samplingProtocol
The name of, reference to, or description of the method or protocol used during an event. In the sheet the cells are validated to contain text with a length of more than one character.
eventDate
The date-time or interval during which an event occurred. Cells are validated against the ISO 8601:2004 standard. Example: 1963-03-08T14:07-0600
eventTime
The time or interval during which an event occurred. Cells are validated against the ISO 8601:2004 standard. Example: 14:07-0600
habitat
A category or description of the habitat in which the event occurred. In the sheet the cells are validated to contain text with a length of more than one character. Example: pre-cordilleran steppe
fieldNumber
An identifier given to the event in the field. Often serves as a link between field notes and the Event. In the sheet the cells are validated to contain text with a length of more than one character.
eventRemarks
Comments or notes about the event. In the sheet the cells are validated to contain text with a length of more than one character. Example: After the recent rains, the river is nearly overflowing.
waterBody
The name of the water body in which the Location occurs. The sheet validates that each cell contains a text value referring to a geographic water body. Example: Indian Ocean, Baltic Sea
country
The name of the country or major administrative unit in which the location occurs. The cell value is validated against a controlled vocabulary.
stateProvince
The name of the next smaller administrative region than country. The cell value is validated against a controlled vocabulary.
county
The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the Location occurs.
minimumElevationInMeters
The lower limit of the range of elevation (altitude, usually above sea level), in meters. The sheet validates that each cell contains only integer values between -100 and 8000. Example: 100
maximumElevationInMeters
The upper limit of the range of elevation (altitude, usually above sea level), in meters. The sheet validates that each cell contains only integer values between -100 and 8000. Example: 200
minimumDepthInMeters
The lower limit of the range of depth below the local surface, in meters. The sheet validates that each cell contains only integer values between -100 (above sea level) and 900 (under sea level). Example: 100
maximumDepthInMeters
The greater depth of a range of depth below the local surface, in meters. The sheet validates that each cell contains only integer values between -100 (above sea level) and 900 (under sea level). Example: 200
decimalLatitude
The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. The sheet validates that the cell value is a decimal number between -90 and 90, written with decimal digits and a point as the decimal separator. Example: -41.0983423
decimalLongitude
The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. The sheet validates that the cell value is a decimal number between -180 and 180, written with decimal digits and a point as the decimal separator. Example: -121.1761111
coordinateUncertaintyInMeters
The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. If the uncertainty is unknown, the cell will be empty.
coordinatePrecision
A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude. Examples: 0.00001 (normal GPS limit for decimal degrees), 0.000278 (nearest second), 0.01667 (nearest minute), 1.0 (nearest degree)
Types of validations
Without content
If the cell is empty, it will be marked with an error (the cell is colored with the color assigned to this type of error).
Content not conform to controlled vocabulary
If the value in the cell is not equal to one of the values in the controlled vocabulary, the cell will be marked with an error.
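A controlled-vocabulary check of this kind can be sketched as a simple set lookup. The vocabulary below is hypothetical (only "female" and "hermaphrodite" appear as examples on this page), and the exact-match behavior is an assumption based on the wording "equal to one of the values".

```python
# Hypothetical vocabulary; the validator's real list may differ.
SEX_VOCABULARY = {"male", "female", "hermaphrodite"}


def in_vocabulary(value, vocabulary):
    """Return True when the cell value exactly matches a vocabulary term."""
    return value in vocabulary
```

Because the comparison is exact, a value such as "Female" would be rejected in this sketch; whether the validator normalizes case or whitespace first is not stated here.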
Invalid date or in incorrect format
If the value in the cell does not conform to the extended format of ISO 8601:2004, or if the date itself is invalid (e.g. February 29 in a non-leap year, or day 31 in a month with only 30 days), the cell will be marked with an error. http://dotat.at/tmp/ISO_8601-2004_E.pdf
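The sketch below shows one way such a date check could work for plain calendar dates in extended form (YYYY-MM-DD). It is a simplification: date-times, intervals, time zones, and the week and ordinal date forms of ISO 8601 are not handled, and the function name is hypothetical.

```python
import re
from datetime import date

# Extended calendar-date form only: YYYY-MM-DD.
_DATE_RE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_valid_extended_date(text):
    """Check the YYYY-MM-DD shape, then let datetime reject impossible dates."""
    match = _DATE_RE.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for e.g. 2007-02-29
    except ValueError:
        return False
    return True
```

Separating the format check from the calendar check keeps the two failure modes named in the rule (wrong format, impossible date) easy to tell apart.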
Invalid time or in incorrect format
If the value in the cell does not conform to the extended format of ISO 8601:2004, or if the time is invalid on a 24-hour clock (e.g. 25:00 and 15:69 are not valid times), the cell will be marked with an error. http://dotat.at/tmp/ISO_8601-2004_E.pdf
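A time check along these lines could look like the sketch below, which accepts hh:mm or hh:mm:ss without a time-zone designator. It is a simplification under stated assumptions: ISO 8601:2004 also permits 24:00 for the end of a day and time-zone offsets such as -0600, neither of which is handled here, and the function name is hypothetical.

```python
import re

# Extended form hh:mm or hh:mm:ss, without a time-zone designator.
_TIME_RE = re.compile(r"([0-9]{2}):([0-9]{2})(?::([0-9]{2}))?")


def is_valid_extended_time(text):
    """Return True for a well-formed time within 00:00:00 to 23:59:59."""
    match = _TIME_RE.fullmatch(text)
    if match is None:
        return False
    hour, minute = int(match.group(1)), int(match.group(2))
    second = int(match.group(3)) if match.group(3) is not None else 0
    return hour <= 23 and minute <= 59 and second <= 59
```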
Text with anomalous characters
If the value entered in the cell contains anomalous characters, the cell will be marked with an error.
Different institution identifiers for the same collection
The values of institutionID and collectionID are related: all records of a collection must share a single institution identifier. If a collectionID value already appears in the column, the institutionID in the same row must equal the institutionID previously recorded for that collection; otherwise the cell will be marked with an error.
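The relation between institution and collection identifiers can be sketched as a single pass that remembers the first institution identifier seen for each collection. The function name and the sample identifiers are hypothetical, and treating the first occurrence as the reference value is an assumption about how "the previous identifier value" is interpreted.

```python
def inconsistent_rows(rows):
    """Return indexes of rows whose institution identifier differs from the
    first one seen for the same collection identifier.

    `rows` is a list of (institution_id, collection_id) pairs in sheet order.
    """
    first_seen = {}
    errors = []
    for index, (institution_id, collection_id) in enumerate(rows):
        # setdefault stores the identifier on first sight, else returns it.
        expected = first_seen.setdefault(collection_id, institution_id)
        if institution_id != expected:
            errors.append(index)
    return errors


# Hypothetical records: the third row contradicts the first.
rows = [
    ("inst-1", "col-A"),
    ("inst-1", "col-B"),
    ("inst-2", "col-A"),
    ("inst-1", "col-A"),
]
flagged = inconsistent_rows(rows)
```

The same pattern would apply to the code-versus-identifier checks, with the pair of columns swapped accordingly.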
Codes with anomalous characters
If the values in the institutionCode or collectionCode cells contain anomalous characters, the cells will be marked with an error.
Different institutionCode for an institutionID
The code for the institution must correspond to a unique institution identifier. If, in a record, the institutionCode differs from the code previously recorded for the same institutionID, the cell will be marked with an error.
Different collectionCode for a collectionID
The code for the collection must correspond to a unique collection identifier. If, in a record, the collectionCode differs from the code previously recorded for the same collectionID, the cell will be marked with an error.
Different catalogNumber for a collection
The values of catalogNumber and collectionID are related, and the records of a collection must each have a unique catalogNumber. If the catalogNumber value already appears in the column, the collectionID in the same row must equal the previously recorded identifier value; otherwise the cell will be marked with an error.
Invalid value or anomalous characters in a list separated by (;)
The cell is validated to contain a list whose items are separated by semicolons (;) and to be free of anomalous characters that could cause difficulties during indexing.
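A list check of this kind could be sketched as below. Which characters count as "anomalous" is not defined on this page, so the allowed set here (letters, digits, spaces, periods, apostrophes, and hyphens) is purely an assumption, as are the function names.

```python
# Assumption: besides letters and digits, these characters are acceptable.
ALLOWED_PUNCTUATION = set(" .'-")


def is_clean_item(item):
    """Return True for a non-empty item made only of allowed characters."""
    return bool(item) and all(
        ch.isalnum() or ch in ALLOWED_PUNCTUATION for ch in item
    )


def is_valid_semicolon_list(text):
    """Split on ';' and require every item to be non-empty and clean."""
    items = [item.strip() for item in text.split(";")]
    return all(is_clean_item(item) for item in items)
```

In this sketch a trailing semicolon produces an empty last item and is therefore rejected.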
Not a positive integer
If the cell value is not a positive integer, the cell will be marked with an error.
Not an integer or not in the interval of valid values: Elevation
If the cell value is not an integer, or is not in the interval of valid values for elevation, the cell will be marked with an error.
Not an integer or not in the interval of valid values: Depth
If the cell value is not an integer, or is not in the interval of valid values for depth, the cell will be marked with an error.
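The elevation and depth checks share the same shape: the text must be an integer and must fall inside a closed interval. A minimal sketch, using the intervals given in the element descriptions (-100 to 8000 for elevation, -100 to 900 for depth); the names are hypothetical.

```python
import re

ELEVATION_RANGE = (-100, 8000)  # meters
DEPTH_RANGE = (-100, 900)  # meters

# Optional sign followed by digits only; rejects "12.5", "1e3", and "".
_INT_RE = re.compile(r"[+-]?[0-9]+")


def is_integer_in_range(text, bounds):
    """Return True when text is an integer within the inclusive bounds."""
    if not _INT_RE.fullmatch(text):
        return False
    low, high = bounds
    return low <= int(text) <= high
```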
Does not correspond to a decimal latitude
If the cell value is not a decimal number between -90 and 90 with 6 digits after the decimal point, the cell will be marked with an error.
Does not correspond to a decimal longitude
If the cell value is not a decimal number between -180 and 180 with 6 digits after the decimal point, the cell will be marked with an error.
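The latitude and longitude checks can be sketched with one shared helper. This version checks only the range and the point as decimal separator; the digit-count rule is not enforced because its exact form (exactly six digits, or up to six) is not specified here. The function names are hypothetical.

```python
import re

# Optional minus sign, digits, a point, and at least one decimal digit.
_DECIMAL_RE = re.compile(r"-?[0-9]+\.[0-9]+")


def is_decimal_in_range(text, limit):
    """Return True for a point-separated decimal within [-limit, limit]."""
    if not _DECIMAL_RE.fullmatch(text):
        return False
    return -limit <= float(text) <= limit


def is_decimal_latitude(text):
    return is_decimal_in_range(text, 90.0)


def is_decimal_longitude(text):
    return is_decimal_in_range(text, 180.0)
```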
Not valid value for uncertainty
The uncertainty value may be left empty if it is unknown, cannot be estimated, or is not applicable. “30” is a valid value (a reasonable lower limit for a GPS reading under good conditions when the actual precision was not recorded at the time), but “0” is not. If the cell value is not between 1 and 8000, the cell will be marked with an error.
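The uncertainty rule combines an "empty is allowed" case with a range check, which could be sketched as follows. Whether decimal values are accepted is not stated here, so parsing with `float` is an assumption, and the function name is hypothetical.

```python
def is_valid_uncertainty(text):
    """Accept an empty cell, or a number between 1 and 8000 inclusive."""
    if text.strip() == "":
        return True  # unknown uncertainty is left empty
    try:
        value = float(text)
    except ValueError:
        return False
    return 1 <= value <= 8000
```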
Not valid value for coordinate precision
If the cell value is not a number with 0 to 6 digits after the decimal point, the cell will be marked with an error.
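The digit-count rule for coordinatePrecision fits a short pattern match. In this sketch a value with no decimal point counts as having zero decimal digits, which is an assumption, and the function name is hypothetical.

```python
import re

# Digits, optionally followed by a point and one to six decimal digits.
_PRECISION_RE = re.compile(r"[0-9]+(?:\.[0-9]{1,6})?")


def is_valid_coordinate_precision(text):
    """Return True when text has at most six digits after the decimal point."""
    return _PRECISION_RE.fullmatch(text) is not None
```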
Does not correspond to scientific name format
If the cell value is text that does not correspond to a scientific name (first letter in uppercase; no more than two white spaces between genus and species, between species and author(s), or at the end of the field), the cell will be marked with an error.
scientificNameAuthorship with anomalous characters
If the value entered in the cell contains anomalous characters, the cell will be marked with an error.
Geographical location not according to entered coordinates
If the values for country, stateProvince, or county do not correspond to the geographical location retrieved for the given decimalLongitude and decimalLatitude, the cell will be marked with an error.
Select language for validations (English/Español)
In the menu, select the required language. By default, the script will make validations in English. Once you have selected the language, it is necessary to reload the window/tab of the spreadsheet, to see the different options in the language of your choice.
If the sheet contains a large number of records, it may stay grayed out after the validations finish. In that case, reload the window or tab of the spreadsheet in your browser.