General data standards - atlanticcanadacdc/template GitHub Wiki

People names

Date standard issued: 200601

Use the form Last name, First name. List additional observers in the same form, separated by semicolons.

Examples:

  1. Blaney, Sean; Robinson, Sarah; Mazerolle, David
  2. Blaney, Sean; et al.
  3. unknown

Note: use unknown only when the observer is known with confidence to be unknown. If the observer is simply not given in the received data, leave the field blank.

  1. <blank>
  2. If the first name or last name is missing, use a “?” to indicate a missing name. For example, Cochran, ? for a missing first name.
  3. Atlasser ID: <number>
  4. If there are 4 observers, list all 4; if there are 5, list 3 followed by ; et al.
  5. iNaturalist usernames: if the user has contributed many records, try to track down the observer's name. If they have contributed few records, use iNaturalist user “x” (we could create a ‘username-to-real-name’ lookup table).
  6. If many observer names are provided and all are formatted to AC CDC standards (above), include all names.
  7. If many observer names are provided and they are not formatted to AC CDC standards, limit to 3 formatted names and group the others as below:
  • Blaney, Sean; Robinson, Sarah; Mazerolle, David, plus 8 observers

Exceptions:

  1. If the observer goes by initials instead of a first name, use the initials, each followed by a period
  • E.g., Lautenschlager, R.A.; Klymko, John
  2. When names are received in initialled format (e.g., Klymko, J.D.), they can be entered as such unless they are easy to convert to full names
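
The People names rules above can be sketched in Python. This is only an illustration, not AC CDC tooling: the function name format_observers and the (last, first) tuple input are assumptions, "5" is read as "5 or more", and only the "; et al." rule is covered (not the "plus N observers" grouping).

```python
def format_observers(observers):
    """Join (last, first) pairs as 'Last, First', separated by semicolons.

    Up to four observers are listed in full; five or more are cut to three
    names followed by '; et al.' (assumed reading of the standard).
    A missing name part becomes '?'. An empty list gives an empty string,
    i.e. a blank field for an observer that was simply not given.
    """
    def one(last, first):
        return f"{last or '?'}, {first or '?'}"

    names = [one(last, first) for last, first in observers]
    if len(names) > 4:
        return "; ".join(names[:3]) + "; et al."
    return "; ".join(names)


print(format_observers([("Blaney", "Sean"), ("Robinson", "Sarah")]))
print(format_observers([("Cochran", None)]))
```

Initials such as R.A. pass through unchanged because the first-name part is treated as plain text.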

Dates

Date standard issued: 13 May, 2019

Expressing date imprecision

  • use (uppercase) X values as placeholders for unknown elements of a date (e.g., 201X 05 XX)
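
A simple character-level check for this placeholder convention might look like the sketch below. The space-separated YYYY MM DD layout is inferred from the example, and the names PARTIAL_DATE and is_partial_date are hypothetical. The check does not test calendar validity.

```python
import re

# Four, two and two characters, each a digit or an uppercase X,
# separated by single spaces (layout assumed from "201X 05 XX").
PARTIAL_DATE = re.compile(r"[0-9X]{4} [0-9X]{2} [0-9X]{2}")


def is_partial_date(text):
    """Return True if text has the shape of a date with optional X placeholders."""
    return PARTIAL_DATE.fullmatch(text) is not None


print(is_partial_date("201X 05 XX"))
print(is_partial_date("201x 05 XX"))
```

Lowercase x is rejected because the standard asks for uppercase placeholders.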

Date fields

  • a total of 7 date fields are in FileMaker and in our data entry spreadsheets:

obDATE1 [internal use only], YYYY, MM, and DD

  • first, or only date observed
  • user can provide data in OBDATE format or in separate YYYY, MM, DD fields; all four values will be uploaded

obDATE2 [internal use only]

  • the last date observed or blank
  • since we very seldom have date range data, it might not be worth exporting obDATE2 with all exports since it will almost always be empty (see obDATE below)
  • user can provide data in OBDATE format or in separate YYYY2, MM2, DD2 fields

obDATEverbatim [currently internal only]

  • this can hold any text description of the date or details about uncertainty
  • e.g., spring 2010, collected sometime between 4 June and 23 July
  • we can decide as a team whether this needs to be exported with data (it may be used more commonly than obDATE2, but likely still not very often)
  • alternatively, this could be concatenated into the GENCOM field (but we are trying to get rid of that concatenated field eventually)

obDATE [sent to data requestors]

  • since obDATE2 will most often be an empty exported field, obDATE will instead be generated automatically from obDATE1 and obDATE2 fields and be exported for all outgoing data
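
The page does not say how obDATE is assembled from the two fields. As a sketch of one possibility, the function below returns obDATE1 alone when obDATE2 is empty and otherwise joins the two. The function name build_obdate and the " to " separator are assumptions, not the FileMaker behaviour.

```python
def build_obdate(obdate1, obdate2=""):
    """Combine first and last observation dates into one export value.

    Assumed behaviour: a blank or identical obDATE2 yields obDATE1 alone;
    otherwise the two dates are joined with ' to ' (hypothetical separator).
    """
    first = (obdate1 or "").strip()
    last = (obdate2 or "").strip()
    if last and last != first:
        return f"{first} to {last}"
    return first


print(build_obdate("2019 05 13"))
print(build_obdate("2019 05 13", "2019 05 20"))
```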

Special characters and qualifiers

  • {~,<,>, around, approx.}
  • to make date values more usable, these should not be used; uncertainty and ranges would get incorporated in the fields above
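
A data-entry check for these qualifiers could be sketched as follows. The name has_date_qualifier and the regular expression are assumptions, as is the case-insensitive matching.

```python
import re

# The listed qualifiers: ~, <, >, "around", "approx." (trailing period optional).
QUALIFIERS = re.compile(r"[~<>]|\baround\b|\bapprox\b\.?", re.IGNORECASE)


def has_date_qualifier(text):
    """Return True if a date value contains one of the discouraged qualifiers."""
    return QUALIFIERS.search(text) is not None


print(has_date_qualifier("approx. 2019 05"))
print(has_date_qualifier("2019 05 XX"))
```

Values that are flagged could then have their uncertainty moved into obDATEverbatim or expressed with X placeholders.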

Text data

Be as concise as possible without sacrificing comprehension (ask: will data users understand the meaning?). Use lowercase text. Avoid acronyms and abbreviations when possible.

Example 1

  • Not concise: Location uncertainty estimated by data manager based on GPS accuracy of an iPhone
  • Concise: LOCUNCM estimated by AC CDC; phone GPS

Example 2

  • Not concise: Coordinates taken from observer's phone
  • Concise: coordinates derived from phone

Internal data standards

When adding information to an existing record, in any field, enclose the added information in square brackets and include your initials and the date of the change.

Example: [site visited by Pamela Mills and Claire Wilson O'Driscoll in 2020: tree was vegetative; JEP 2023 04 17]
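
The bracket convention can be sketched as a small helper. The function name annotate is hypothetical, and joining the addition to the existing text with a single space is an assumption. The date layout follows the example above.

```python
from datetime import date


def annotate(existing, note, initials, when):
    """Append '[note; INITIALS YYYY MM DD]' to an existing field value."""
    stamp = when.strftime("%Y %m %d")
    addition = f"[{note}; {initials} {stamp}]"
    # strip() removes the leading space when the field was empty
    return f"{existing} {addition}".strip()


print(annotate("on north slope", "tree was vegetative", "JEP", date(2023, 4, 17)))
```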

Standard texts

  1. When the AC CDC directions script is used for directions, append:

Directions generated by AC CDC script

e.g., wpt 345, wpt 345-356; Directions generated by AC CDC script

  2. When SURVEYSITE is assigned by AC CDC, append:

SURVEYSITE set by AC CDC as nearest CGNDB locality

  3. When estimated by the observer, use:

estimated by observer

  4. When estimated by AC CDC, use:

estimated by AC CDC

  5. To clarify NOTELOCcoordinates once it is concatenated, use:

coordinates derived from

  6. To clarify NOTELOClocuncm once it is concatenated, use:

LOCUNCM estimated by
